SLC7s as modifiers of the p53 pathway and methods of use

ABSTRACT

Human SLC7 genes are identified as modulators of the p53 pathway, and thus are therapeutic targets for disorders associated with defective p53 function. Methods for identifying modulators of p53, comprising screening for agents that modulate the activity of SLC7, are provided.

REFERENCE TO RELATED APPLICATIONS

[0001] This application claims priority to U.S. provisional patent applications 60/296,076 filed Jun. 5, 2001, 60/328,605 filed Oct. 10, 2001, 60/338,733 filed Oct. 22, 2001, 60/357,253 filed Feb. 15, 2002, and 60/357,600 filed Feb. 15, 2002. The contents of the prior applications are hereby incorporated in their entirety.

BACKGROUND OF THE INVENTION

[0002] The p53 gene is mutated in over 50 different types of human cancers, including familial and spontaneous cancers, and is believed to be the most commonly mutated gene in human cancer (Zambetti and Levine, FASEB (1993) 7:855-865; Hollstein, et al., Nucleic Acids Res. (1994) 22:3551-3555). More than 90% of mutations in the p53 gene are missense mutations that alter a single amino acid and thereby inactivate p53 function. Aberrant forms of human p53 are associated with poor prognosis, more aggressive tumors, metastasis, and short survival rates (Mitsudomi et al., Clin Cancer Res October 2000; 6(10):4055-63; Koshland, Science (1993) 262:1953).

[0003] The human p53 protein normally functions as a central integrator of signals including DNA damage, hypoxia, nucleotide deprivation, and oncogene activation (Prives, Cell (1998) 95:5-8). In response to these signals, p53 protein levels are greatly increased with the result that the accumulated p53 activates cell cycle arrest or apoptosis depending on the nature and strength of these signals. Indeed, multiple lines of experimental evidence have pointed to a key role for p53 as a tumor suppressor (Levine, Cell (1997) 88:323-331). For example, homozygous p53 “knockout” mice are developmentally normal but exhibit nearly 100% incidence of neoplasia in the first year of life (Donehower et al., Nature (1992) 356:215-221).

[0004] The biochemical mechanisms and pathways through which p53 functions in normal and cancerous cells are not fully understood, but one clearly important aspect of p53 function is its activity as a gene-specific transcriptional activator. Among the genes with known p53-response elements are several with well-characterized roles in either regulation of the cell cycle or apoptosis, including GADD45, p21Waf1/Cip1, cyclin G, Bax, IGF-BP3, and MDM2 (Levine, Cell (1997) 88:323-331).

[0005] The transport of amino acids across cellular membranes is adapted to the needs of specific cells as well as to local and systemic requirements. For instance, active amino acid uptake is a necessity for growing cells. Various members of the novel family of glycoprotein-associated amino acid transporters, namely the light subunits of heterodimeric amino acid transporters or solute carrier family 7 (SLC7), have been identified and shown to play a role in cellular uptake and/or basolateral extrusion of basic and neutral amino acids (Rossier G et al. (1999) J Biol Chem, 274: 34948-34954). These permease-related proteins with 12 transmembrane domains require heterodimerization with a type II heavy chain glycoprotein such as 4F2hc (4F2 heavy chain) or rBAT to express their function. The association of glycoprotein-associated amino acid transporters with 4F2hc or possibly rBAT is a prerequisite for the transporters to reach the cell surface (Mastroberardino L, et al. (1998) Nature 395:288-291). In epithelial tissues, for example, trafficking of the 4F2hc subunit ensures a basolateral location, where the transporters allow the release of neutral or cationic amino acids into the blood (Broer A et al. (2000) Biochem. J. 349:787-795; Verrey F et al. (1999) J Membr Biol. 172:181-192; Christensen HN (1990) Physiol Rev. 70:43-77; Broer S (1998) Nova Acta Leopoldinana 306:79-91).

[0006] Members of the SLC7 family of transporters are evolutionarily conserved. Possible involvement of SLC7A5 (LAT1) in colon cancer has been reported (Wolf D et al (1996) Cancer Res. 56:5012-5022). SLC7A7 is implicated in LPI (lysinuric protein intolerance) (Torrents D et al (1999) Nature Genet 21:293-296; Borsani G et al (1999) Nature Genet 21:297-301). Further, members SLC7A9 and SLC7A10 are implicated in cystinuria (Feliubadalo L et al. (1999) Nat Genet. 23:52-57; Leclerc D et al. (2001) Mol Genet Metab. 73:333-339).

[0007] The ability to manipulate the genomes of model organisms such as Drosophila provides a powerful means to analyze biochemical processes that, due to significant evolutionary conservation, have direct relevance to more complex vertebrate organisms. Due to a high level of gene and pathway conservation, the strong similarity of cellular processes, and the functional conservation of genes between these model organisms and mammals, identification of the involvement of novel genes in particular pathways and their functions in such model organisms can directly contribute to the understanding of the correlative pathways and methods of modulating them in mammals (see, for example, Mechler B M et al., 1985 EMBO J 4:1551-1557; Gateff E. 1982 Adv. Cancer Res. 37:33-74; Watson K L., et al., 1994 J Cell Sci. 18:19-33; Miklos G L, and Rubin G M. 1996 Cell 86:521-529; Wassarman D A, et al., 1995 Curr Opin Gen Dev 5:44-50; and Booth D R. 1999 Cancer Metastasis Rev. 18: 261-284). For example, a genetic screen can be carried out in an invertebrate model organism having underexpression (e.g. knockout) or overexpression of a gene (referred to as a “genetic entry point”) that yields a visible phenotype. Additional genes are mutated in a random or targeted manner. When a gene mutation changes the original phenotype caused by the mutation in the genetic entry point, the gene is identified as a “modifier” involved in the same or overlapping pathway as the genetic entry point. When the genetic entry point is an ortholog of a human gene implicated in a disease pathway, such as p53, modifier genes can be identified that may be attractive candidate targets for novel therapeutics.

[0008] All references cited herein, including sequence information in referenced Genbank identifier numbers and website references, are incorporated herein in their entireties.

SUMMARY OF THE INVENTION

[0009] We have discovered genes that modify the p53 pathway in Drosophila, and identified their human orthologs, hereinafter referred to as SLC7s. The invention provides methods for utilizing these p53 modifier genes and polypeptides to identify SLC7-modulating agents that are candidate therapeutic agents that can be used in the treatment of disorders associated with defective or impaired p53 function and/or SLC7 function. Preferred SLC7-modulating agents specifically bind to SLC7 polypeptides and restore p53 function. Other preferred SLC7-modulating agents are nucleic acid modulators such as antisense oligomers and RNAi that repress SLC7 gene expression or product activity by, for example, binding to and inhibiting the respective nucleic acid (i.e. DNA or mRNA).

[0010] SLC7-modulating agents may be evaluated by any convenient in vitro or in vivo assay for molecular interaction with an SLC7 polypeptide or nucleic acid. In one embodiment, candidate SLC7-modulating agents are tested with an assay system comprising an SLC7 polypeptide or nucleic acid. In one preferred embodiment, the SLC7 polypeptide or nucleic acid is SLC7A5 or SLC7A11. Agents that produce a change in the activity of the assay system relative to controls are identified as candidate p53 modulating agents. The assay system may be cell-based or cell-free. SLC7-modulating agents include, but are not limited to, SLC7 related proteins (e.g. dominant negative mutants, and biotherapeutics); SLC7-specific antibodies; SLC7-specific antisense oligomers and other nucleic acid modulators; and chemical agents that specifically bind to or interact with SLC7 (e.g. by binding to an SLC7 binding partner). In one specific embodiment, a small molecule modulator is identified using a transporter assay. In specific embodiments, the screening assay system is selected from a binding assay, an apoptosis assay, a cell proliferation assay, an angiogenesis assay, and a hypoxic induction assay.

[0011] In another embodiment, candidate p53 pathway modulating agents are further tested using a second assay system that detects changes in the p53 pathway, such as angiogenic, apoptotic, or cell proliferation changes produced by the originally identified candidate agent or an agent derived from the original agent. The second assay system may use cultured cells or non-human animals. In specific embodiments, the secondary assay system uses non-human animals, including animals predetermined to have a disease or disorder implicating the p53 pathway, such as an angiogenic, apoptotic, or cell proliferation disorder (e.g. cancer).

[0012] The invention further provides methods for modulating SLC7 function and/or the p53 pathway in a mammalian cell by contacting the mammalian cell with an agent that specifically binds an SLC7 polypeptide or nucleic acid. In a preferred embodiment, the SLC7 polypeptide or nucleic acid is SLC7A5 or SLC7A11. The agent may be a small molecule modulator, a nucleic acid modulator, or an antibody and may be administered to a mammalian animal predetermined to have a pathology associated with the p53 pathway.

DETAILED DESCRIPTION OF THE INVENTION

[0013] Genetic screens were designed to identify modifiers of the p53 pathway in Drosophila, in which p53 was overexpressed in the wing (Ollmann M, et al., Cell 2000 101:91-101). The CG1607 gene was identified as a modifier of the p53 pathway. Accordingly, vertebrate orthologs of these modifiers, and preferably the human orthologs, SLC7 genes (i.e., nucleic acids and polypeptides) are attractive drug targets for the treatment of pathologies associated with a defective p53 signaling pathway, such as cancer.

[0014] In vitro and in vivo methods of assessing SLC7 function are provided herein. Modulation of the SLC7 or their respective binding partners is useful for understanding the association of the p53 pathway and its members in normal and disease conditions and for developing diagnostics and therapeutic modalities for p53 related pathologies. SLC7-modulating agents that act by inhibiting or enhancing SLC7 expression, directly or indirectly, for example, by affecting an SLC7 function such as transport or binding activity, can be identified using methods provided herein. SLC7 modulating agents are useful in diagnosis, therapy and pharmaceutical development.

[0015] Nucleic Acids and Polypeptides of the Invention

[0016] Sequences related to SLC7 nucleic acids and polypeptides that can be used in the invention are disclosed in Genbank (referenced by Genbank identifier (GI) number) as GI#s 13649338 (SEQ ID NO: 1), 19923169 (SEQ ID NO: 2), 14424513 (SEQ ID NO: 4), 3639057 (SEQ ID NO: 5), 181907 (SEQ ID NO: 6), 7706245 (SEQ ID NO: 7), 13647529 (SEQ ID NO: 8), 4507052 (SEQ ID NO: 9), 13111751 (SEQ ID NO: 11), 6642957 (SEQ ID NO: 13), 6642959 (SEQ ID NO: 14), 6912269 (SEQ ID NO: 15), 9187132 (SEQ ID NO: 17), 9187131 (SEQ ID NO: 18), 6179884 (SEQ ID NO: 19), 4581469 (SEQ ID NO: 20), 5823977 (SEQ ID NO: 21), 1476395 (SEQ ID NO: 22), 7657590 (SEQ ID NO: 23), 10863043 (SEQ ID NO: 25), 9790234 (SEQ ID NO: 26), 18490890 (SEQ ID NO: 28), 13924719 (SEQ ID NO: 29), 7657682 (SEQ ID NO: 30), 18141306 (SEQ ID NO: 31), 18557166 (SEQ ID NO: 32), 13516845 (SEQ ID NO: 34), 13654279 (SEQ ID NO: 35), and 14775898 (SEQ ID NO: 36) for nucleic acid, and GI#s 11432073 (SEQ ID NO: 37), 4519803 (SEQ ID NO: 38), 7706246 (SEQ ID NO: 39), 4507053 (SEQ ID NO: 40), 13111752 (SEQ ID NO: 41), 4507055 (SEQ ID NO: 42), 12643348 (SEQ ID NO: 43), 14751168 (SEQ ID NO: 44), 5823978 (SEQ ID NO: 45), 7657591 (SEQ ID NO: 46), 9790235 (SEQ ID NO: 47), 13924720 (SEQ ID NO: 48), 7657683 (SEQ ID NO: 49), and 13654280 (SEQ ID NO: 50) for polypeptides. Additionally, nucleic acid sequences of SEQ ID NOs: 3, 10, 12, 16, 24, 27, 33, and 51-53, and amino acid sequences of SEQ ID NO: 54 can also be used in the invention.

[0017] SLC7s are membrane transport proteins with permease domains. The term “SLC7 polypeptide” refers to a full-length SLC7 protein or a functionally active fragment or derivative thereof. A “functionally active” SLC7 fragment or derivative exhibits one or more functional activities associated with a full-length, wild-type SLC7 protein, such as antigenic or immunogenic activity, enzymatic activity, ability to bind natural cellular substrates, etc. The functional activity of SLC7 proteins, derivatives and fragments can be assayed by various methods known to one skilled in the art (Current Protocols in Protein Science (1998) Coligan et al., eds., John Wiley & Sons, Inc., Somerset, N.J.) and as further discussed below. For purposes herein, functionally active fragments also include those fragments that comprise one or more structural domains of an SLC7, such as a permease domain or a binding domain. Protein domains can be identified using the PFAM program (Bateman A., et al., Nucleic Acids Res, 1999, 27:260-2; http://pfam.wustl.edu). For example, the permease domain of SLC7A5 from GI# 4519803 (SEQ ID NO: 38) is located at approximately amino acid residues 46-481 (PFAM 00324). Methods for obtaining SLC7 polypeptides are also further described below. In some embodiments, preferred fragments are functionally active, domain-containing fragments comprising at least 25 contiguous amino acids, preferably at least 50, more preferably 75, and most preferably at least 100 contiguous amino acids of any one of SEQ ID NOs: 37 through 50 (an SLC7). In further preferred embodiments, the fragment comprises the entire permease (functionally active) domain.
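By way of illustration only, the fragment criteria of the preceding paragraph can be expressed as a short check. The sketch below is a minimal example, assuming 1-based inclusive residue coordinates; the function names and tier labels are hypothetical, and the domain bounds are the approximate residues 46-481 stated above for SLC7A5 (SEQ ID NO: 38). It does not test functional activity, which must be determined by assay.

```python
# Approximate permease domain of SLC7A5 (SEQ ID NO: 38), as stated in the text.
PERMEASE_DOMAIN = (46, 481)

# Contiguous-length thresholds from the text, checked from most to least stringent.
# The labels are hypothetical shorthand for the preference levels.
LENGTH_TIERS = [
    (100, "most preferred"),
    (75, "more preferred"),
    (50, "preferred"),
    (25, "minimum"),
]


def fragment_length_tier(start, end):
    """Return the preference label for a fragment spanning residues start..end,
    or None if it is shorter than 25 contiguous amino acids."""
    length = end - start + 1
    for threshold, label in LENGTH_TIERS:
        if length >= threshold:
            return label
    return None


def contains_entire_domain(start, end, domain=PERMEASE_DOMAIN):
    """True if the fragment start..end covers the whole domain."""
    return start <= domain[0] and end >= domain[1]
```

For instance, a hypothetical fragment spanning residues 40-490 falls in the highest length tier and contains the entire permease domain, whereas a fragment spanning residues 50-481 does not contain the entire domain.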

[0018] The term “SLC7 nucleic acid” refers to a DNA or RNA molecule that encodes a SLC7 polypeptide. Preferably, the SLC7 polypeptide or nucleic acid or fragment thereof is from a human, but can also be an ortholog, or derivative thereof with at least 70% sequence identity, preferably at least 80%, more preferably 85%, still more preferably 90%, and most preferably at least 95% sequence identity with SLC7. Normally, orthologs in different species retain the same function, due to presence of one or more protein motifs and/or 3-dimensional structures. Orthologs are generally identified by sequence homology analysis, such as BLAST analysis, usually using protein bait sequences. Sequences are assigned as a potential ortholog if the best hit sequence from the forward BLAST result retrieves the original query sequence in the reverse BLAST (Huynen M A and Bork P, Proc Natl Acad Sci (1998) 95:5849-5856; Huynen M A et al., Genome Research (2000) 10:1204-1210). Programs for multiple sequence alignment, such as CLUSTAL (Thompson J D et al, 1994, Nucleic Acids Res 22:4673-4680) may be used to highlight conserved regions and/or residues of orthologous proteins and to generate phylogenetic trees. In a phylogenetic tree representing multiple homologous sequences from diverse species (e.g., retrieved through BLAST analysis), orthologous sequences from two species generally appear closest on the tree with respect to all other sequences from these two species. Structural threading or other analysis of protein folding (e.g., using software by ProCeryon, Biosciences, Salzburg, Austria) may also identify potential orthologs. In evolution, when a gene duplication event follows speciation, a single gene in one species, such as Drosophila, may correspond to multiple genes (paralogs) in another, such as human. As used herein, the term “orthologs” encompasses paralogs. 
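By way of illustration only, the reciprocal best-hit criterion for assigning potential orthologs can be sketched as follows. This is a minimal example, assuming the forward and reverse BLAST results have already been parsed into dictionaries that map each query identifier to a list of (subject, score) pairs; the function names and the sequence identifiers are hypothetical and do not correspond to any BLAST software interface.

```python
def best_hit(hits):
    """Return the subject identifier with the highest score, or None if no hits."""
    if not hits:
        return None
    return max(hits, key=lambda pair: pair[1])[0]


def is_potential_ortholog(query, forward_hits, reverse_hits):
    """Reciprocal best-hit test.

    forward_hits: {query_id: [(subject_id, score), ...]}, species A against species B
    reverse_hits: {subject_id: [(query_id, score), ...]}, species B against species A
    Returns True when the best forward hit retrieves the original query
    as its own best hit in the reverse search.
    """
    forward_best = best_hit(forward_hits.get(query, []))
    if forward_best is None:
        return False
    reverse_best = best_hit(reverse_hits.get(forward_best, []))
    return reverse_best == query


# Hypothetical parsed results; the scores are made up for illustration.
forward = {"fly_gene": [("human_1", 250.0), ("human_2", 180.0)]}
reverse = {
    "human_1": [("fly_gene", 240.0), ("fly_other", 90.0)],
    "human_2": [("fly_other", 200.0), ("fly_gene", 170.0)],
}
```

In this toy data, "human_1" is the best forward hit for "fly_gene" and retrieves "fly_gene" in the reverse search, so the pair passes the test. Because the text treats paralogs as orthologs, a practical implementation might relax the test to accept several top-scoring hits; that extension is not shown here.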
As used herein, “percent (%) sequence identity” with respect to a subject sequence, or a specified portion of a subject sequence, is defined as the percentage of nucleotides or amino acids in the candidate derivative sequence identical with the nucleotides or amino acids in the subject sequence (or specified portion thereof), after aligning the sequences and introducing gaps, if necessary to achieve the maximum percent sequence identity, as generated by the program WU-BLAST-2.0a19 (Altschul et al., J. Mol. Biol. (1997) 215:403-410; http://blast.wustl.edu/blast/README.html) with all the search parameters set to default values. The HSP S and HSP S2 parameters are dynamic values and are established by the program itself depending upon the composition of the particular sequence and composition of the particular database against which the sequence of interest is being searched. A % identity value is determined by the number of matching identical nucleotides or amino acids divided by the sequence length for which the percent identity is being reported. “Percent (%) amino acid sequence similarity” is determined by doing the same calculation as for determining % amino acid sequence identity, but including conservative amino acid substitutions in addition to identical amino acids in the computation.
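By way of illustration only, the percent identity calculation defined above (matching identical residues divided by the sequence length for which identity is reported) can be sketched as follows. This is a minimal example, assuming the two sequences have already been aligned by a separate program and are supplied as equal-length strings with "-" marking gaps; it does not run WU-BLAST, and the function name and toy sequences are hypothetical.

```python
def percent_identity(aligned_subject, aligned_candidate, reported_length=None):
    """Percent identity over a pre-computed pairwise alignment.

    aligned_subject, aligned_candidate: equal-length strings, '-' marks a gap.
    reported_length: length over which identity is reported; by assumption it
    defaults to the ungapped length of the subject sequence.
    """
    if len(aligned_subject) != len(aligned_candidate):
        raise ValueError("aligned sequences must have equal length")
    # Count columns where both sequences carry the same residue (gaps never match).
    matches = sum(
        1
        for s, c in zip(aligned_subject, aligned_candidate)
        if s == c and s != "-"
    )
    if reported_length is None:
        reported_length = sum(1 for s in aligned_subject if s != "-")
    if reported_length <= 0:
        raise ValueError("reported length must be positive")
    return 100.0 * matches / reported_length


# Toy alignment: 6 identical positions over a subject of length 8.
identity = percent_identity("MKTAYIAK", "MKT-YLAK")
```

Passing an explicit `reported_length` covers the case in which identity is reported for a specified portion of the subject sequence rather than its full length.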

[0019] A conservative amino acid substitution is one in which an amino acid is substituted for another amino acid having similar properties such that the folding or activity of the protein is not significantly affected. Aromatic amino acids that can be substituted for each other are phenylalanine, tryptophan, and tyrosine; interchangeable hydrophobic amino acids are leucine, isoleucine, methionine, and valine; interchangeable polar amino acids are glutamine and asparagine; interchangeable basic amino acids are arginine, lysine and histidine; interchangeable acidic amino acids are aspartic acid and glutamic acid; and interchangeable small amino acids are alanine, serine, threonine, cysteine and glycine.
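By way of illustration only, the interchangeable groups listed above can be encoded as sets and used to compute percent similarity, that is, identity plus conservative substitutions. This is a minimal sketch, assuming a pre-computed alignment with "-" as the gap character and the ungapped subject length as the denominator; the group labels and function names are hypothetical.

```python
# Interchangeable amino acid groups as listed in the text (one-letter codes).
CONSERVATIVE_GROUPS = {
    "aromatic": set("FWY"),
    "hydrophobic": set("LIMV"),
    "polar": set("QN"),
    "basic": set("RKH"),
    "acidic": set("DE"),
    "small": set("ASTCG"),
}


def is_conservative(a, b):
    """True if a and b are different residues belonging to the same group."""
    if a == b:
        return False
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS.values())


def percent_similarity(aligned_subject, aligned_candidate):
    """Percent of subject residues that are identical to, or conservatively
    substituted by, the aligned candidate residue."""
    if len(aligned_subject) != len(aligned_candidate):
        raise ValueError("aligned sequences must have equal length")
    similar = 0
    length = 0
    for s, c in zip(aligned_subject, aligned_candidate):
        if s == "-":
            continue  # gap in the subject: not counted toward its length
        length += 1
        if c != "-" and (s == c or is_conservative(s, c)):
            similar += 1
    if length == 0:
        raise ValueError("subject sequence is empty")
    return 100.0 * similar / length
```

In the toy alignment of "MKTAYIAK" against "MRTSYLAK", five positions are identical and the remaining three (K/R, A/S, I/L) are conservative substitutions, so the similarity is 100% while the identity is 62.5%.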

[0020] Alternatively, an alignment for nucleic acid sequences is provided by the local homology algorithm of Smith and Waterman (Smith and Waterman, 1981, Advances in Applied Mathematics 2:482-489; database: European Bioinformatics Institute http://www.ebi.ac.uk/MPsrch/; Smith and Waterman, 1981, J. of Molec. Biol., 147:195-197; Nicholas et al., 1998, “A Tutorial on Searching Sequence Databases and Sequence Scoring Methods” (www.psc.edu) and references cited therein; W. R. Pearson, 1991, Genomics 11:635-650). This algorithm can be applied to amino acid sequences by using the scoring matrix developed by Dayhoff (Dayhoff: Atlas of Protein Sequences and Structure, M. O. Dayhoff ed., 5 suppl. 3:353-358, National Biomedical Research Foundation, Washington, D.C., USA), and normalized by Gribskov (Gribskov 1986 Nucl. Acids Res. 14(6):6745-6763). The Smith-Waterman algorithm may be employed where default parameters are used for scoring (for example, gap open penalty of 12, gap extension penalty of two). From the data generated, the “Match” value reflects “sequence identity.”
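By way of illustration only, a Smith-Waterman local alignment score with the example gap penalties above (gap open 12, gap extension 2) can be computed as sketched below. This is a minimal example under stated assumptions: a simple match/mismatch score (+5/-4) stands in for a substitution matrix such as the Dayhoff matrix, a gap of length k is charged 12 + 2(k - 1), and only the optimal score is returned rather than the alignment or its “Match” value. The function name is hypothetical, and other implementations may define the gap penalties differently.

```python
def smith_waterman_score(a, b, match=5, mismatch=-4, gap_open=12, gap_extend=2):
    """Best local alignment score of a and b with affine gap penalties.

    A gap of length k costs gap_open + (k - 1) * gap_extend (assumed convention).
    """
    neg_inf = float("-inf")
    rows, cols = len(a) + 1, len(b) + 1
    # H: best score of a local alignment ending at (i, j)
    # E: best score ending at (i, j) with a gap in `a` (b[j-1] unpaired)
    # F: best score ending at (i, j) with a gap in `b` (a[i-1] unpaired)
    H = [[0] * cols for _ in range(rows)]
    E = [[neg_inf] * cols for _ in range(rows)]
    F = [[neg_inf] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            # Either open a new gap from H or extend an existing one.
            E[i][j] = max(H[i][j - 1] - gap_open, E[i][j - 1] - gap_extend)
            F[i][j] = max(H[i - 1][j] - gap_open, F[i - 1][j] - gap_extend)
            diagonal = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: scores are never allowed to drop below zero.
            H[i][j] = max(0, diagonal, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best


# Twelve matches joined across a two-residue gap: 12 * 5 - (12 + 2) = 46.
score = smith_waterman_score("AAAAAACCCCCC", "AAAAAAGGCCCCCC")
```

To apply the same recurrence to amino acid sequences, the match/mismatch term would be replaced by a lookup in the chosen scoring matrix.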

[0021] Derivative nucleic acid molecules of the subject nucleic acid molecules include sequences that hybridize to the nucleic acid sequence of any of SEQ ID NOs: 1 through 36. The stringency of hybridization can be controlled by temperature, ionic strength, pH, and the presence of denaturing agents such as formamide during hybridization and washing. Conditions routinely used are set out in readily available procedure texts (e.g., Current Protocols in Molecular Biology, Vol. 1, Chap. 2.10, John Wiley & Sons, Publishers (1994); Sambrook et al., Molecular Cloning, Cold Spring Harbor (1989)). In some embodiments, a nucleic acid molecule of the invention is capable of hybridizing to a nucleic acid molecule containing the nucleotide sequence of any one of SEQ ID NOs: 1 through 36 under stringent hybridization conditions that comprise: prehybridization of filters containing nucleic acid for 8 hours to overnight at 65° C. in a solution comprising 6× saline sodium citrate (SSC) (1× SSC is 0.15 M NaCl, 0.015 M Na citrate; pH 7.0), 5× Denhardt's solution, 0.05% sodium pyrophosphate and 100 μg/ml herring sperm DNA; hybridization for 18-20 hours at 65° C. in a solution containing 6× SSC, 1× Denhardt's solution, 100 μg/ml yeast tRNA and 0.05% sodium pyrophosphate; and washing of filters at 65° C. for 1 h in a solution containing 0.2× SSC and 0.1% SDS (sodium dodecyl sulfate).

[0022] In other embodiments, moderately stringent hybridization conditions are used that comprise: pretreatment of filters containing nucleic acid for 6 h at 40° C. in a solution containing 35% formamide, 5× SSC, 50 mM Tris-HCl (pH 7.5), 5 mM EDTA, 0.1% PVP, 0.1% Ficoll, 1% BSA, and 500 μg/ml denatured salmon sperm DNA; hybridization for 18-20 h at 40° C. in a solution containing 35% formamide, 5× SSC, 50 mM Tris-HCl (pH 7.5), 5 mM EDTA, 0.02% PVP, 0.02% Ficoll, 0.2% BSA, 100 μg/ml salmon sperm DNA, and 10% (wt/vol) dextran sulfate; followed by washing twice for 1 hour at 55° C. in a solution containing 2× SSC and 0.1% SDS.

[0023] Alternatively, low stringency conditions can be used that comprise: incubation for 8 hours to overnight at 37° C. in a solution comprising 20% formamide, 5× SSC, 50 mM sodium phosphate (pH 7.6), 5× Denhardt's solution, 10% dextran sulfate, and 20 μg/ml denatured sheared salmon sperm DNA; hybridization in the same buffer for 18 to 20 hours; and washing of filters in 1× SSC at about 37° C. for 1 hour.
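By way of illustration only, the salt concentrations implied by the SSC strengths recited in the three sets of hybridization conditions can be worked out from the 1× definition (0.15 M NaCl, 0.015 M Na citrate). The sketch below is a minimal example; the dictionary layout and function name are hypothetical, and decimal arithmetic is used only to avoid binary floating-point rounding in the printed values.

```python
from decimal import Decimal

# Composition of 1x SSC as defined in the text.
SSC_1X = {"NaCl_M": Decimal("0.15"), "Na_citrate_M": Decimal("0.015")}

# SSC strength and temperature of the wash step for each set of conditions.
WASH_CONDITIONS = {
    "stringent": {"ssc_fold": "0.2", "temp_C": 65},
    "moderate": {"ssc_fold": "2", "temp_C": 55},
    "low": {"ssc_fold": "1", "temp_C": 37},
}


def ssc_molarity(fold):
    """Molar concentrations of NaCl and Na citrate in `fold`x SSC."""
    factor = Decimal(str(fold))
    return {name: conc * factor for name, conc in SSC_1X.items()}


if __name__ == "__main__":
    for label, wash in WASH_CONDITIONS.items():
        salts = ssc_molarity(wash["ssc_fold"])
        print(label, wash["temp_C"], "C", salts["NaCl_M"], "M NaCl")
```

On this arithmetic, the 6× SSC hybridization solution contains 0.9 M NaCl and 0.09 M Na citrate, whereas the 0.2× SSC stringent wash contains 0.03 M NaCl and 0.003 M Na citrate, a thirty-fold reduction in ionic strength.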

[0024] Isolation, Production, Expression, and Mis-expression of SLC7 Nucleic Acids and Polypeptides

[0025] SLC7 nucleic acids and polypeptides are useful for identifying and testing agents that modulate SLC7 function and for other applications related to the involvement of SLC7 in the p53 pathway. SLC7 nucleic acids and derivatives and orthologs thereof may be obtained using methods known to those skilled in the art. For instance, techniques for isolating cDNA or genomic DNA sequences of interest by screening DNA libraries or by using polymerase chain reaction (PCR) are well known in the art. In general, the particular use for the protein will dictate the particulars of expression, production, and purification methods. For instance, production of proteins for use in screening for modulating agents may require methods that preserve specific biological activities of these proteins, whereas production of proteins for antibody generation may require structural integrity of particular epitopes. Expression of proteins to be purified for screening or antibody production may require the addition of specific tags (e.g., generation of fusion proteins). Overexpression of an SLC7 protein for assays used to assess SLC7 function, such as involvement in cell cycle regulation or hypoxic response, may require expression in eukaryotic cell lines capable of these cellular activities. Techniques for the expression, production, and purification of proteins are well known in the art; any suitable means therefor may be used (e.g., Higgins S J and Hames B D (eds.) Protein Expression: A Practical Approach, Oxford University Press Inc., New York 1999; Stanbury P F et al., Principles of Fermentation Technology, 2^(nd) edition, Elsevier Science, New York, 1995; Doonan S (ed.) Protein Purification Protocols, Humana Press, New Jersey, 1996; Coligan J E et al, Current Protocols in Protein Science (eds.), 1999, John Wiley & Sons, New York). In particular embodiments, recombinant SLC7 is expressed in a cell line known to have defective p53 function (e.g. SAOS-2 osteoblasts, H1299 lung cancer cells, C33A and HT3 cervical cancer cells, HT-29 and DLD-1 colon cancer cells, among others, available from American Type Culture Collection (ATCC), Manassas, Va.). The recombinant cells are used in cell-based screening assay systems of the invention, as described further below.

[0026] The nucleotide sequence encoding an SLC7 polypeptide can be inserted into any appropriate expression vector. The necessary transcriptional and translational signals, including promoter/enhancer element, can derive from the native SLC7 gene and/or its flanking regions or can be heterologous. A variety of host-vector expression systems may be utilized, such as mammalian cell systems infected with virus (e.g. vaccinia virus, adenovirus, etc.); insect cell systems infected with virus (e.g. baculovirus); microorganisms such as yeast containing yeast vectors, or bacteria transformed with bacteriophage, plasmid, or cosmid DNA. A host cell strain that modulates the expression of, modifies, and/or specifically processes the gene product may be used.

[0027] To detect expression of the SLC7 gene product, the expression vector can comprise a promoter operably linked to an SLC7 gene nucleic acid, one or more origins of replication, and one or more selectable markers (e.g. thymidine kinase activity, resistance to antibiotics, etc.). Alternatively, recombinant expression vectors can be identified by assaying for the expression of the SLC7 gene product based on the physical or functional properties of the SLC7 protein in in vitro assay systems (e.g. immunoassays).

[0028] The SLC7 protein, fragment, or derivative may be optionally expressed as a fusion, or chimeric protein product (i.e. it is joined via a peptide bond to a heterologous protein sequence of a different protein), for example to facilitate purification or detection. A chimeric product can be made by ligating the appropriate nucleic acid sequences encoding the desired amino acid sequences to each other using standard methods and expressing the chimeric product. A chimeric product may also be made by protein synthetic techniques, e.g. by use of a peptide synthesizer (Hunkapiller et al., Nature (1984) 310:105-111).

[0029] Once a recombinant cell that expresses the SLC7 gene sequence is identified, the gene product can be isolated and purified using standard methods (e.g. ion exchange, affinity, and gel exclusion chromatography; centrifugation; differential solubility; electrophoresis). Alternatively, native SLC7 proteins can be purified from natural sources, by standard methods (e.g. immunoaffinity purification). Once a protein is obtained, it may be quantified and its activity measured by appropriate methods, such as immunoassay, bioassay, or other measurements of physical properties, such as crystallography.

[0030] The methods of this invention may also use cells that have been engineered for altered expression (mis-expression) of SLC7 or other genes associated with the p53 pathway. As used herein, mis-expression encompasses ectopic expression, over-expression, under-expression, and non-expression (e.g. by gene knock-out or blocking expression that would otherwise normally occur).

[0031] Genetically Modified Animals

[0032] Animal models that have been genetically modified to alter SLC7 expression may be used in in vivo assays to test for activity of a candidate p53 modulating agent, or to further assess the role of SLC7 in a p53 pathway process such as apoptosis or cell proliferation. Preferably, the altered SLC7 expression results in a detectable phenotype, such as decreased or increased levels of cell proliferation, angiogenesis, or apoptosis compared to control animals having normal SLC7 expression. The genetically modified animal may additionally have altered p53 expression (e.g. p53 knockout). Preferred genetically modified animals are mammals such as primates, rodents (preferably mice), cows, horses, goats, sheep, pigs, dogs and cats. Preferred non-mammalian species include zebrafish, C. elegans, and Drosophila. Preferred genetically modified animals are transgenic animals having a heterologous nucleic acid sequence present as an extrachromosomal element in a portion of their cells, i.e. mosaic animals (see, for example, techniques described by Jakobovits, 1994, Curr. Biol. 4:761-763) or stably integrated into their germ line DNA (i.e., in the genomic sequence of most or all of their cells). Heterologous nucleic acid is introduced into the germ line of such transgenic animals by genetic manipulation of, for example, embryos or embryonic stem cells of the host animal.

[0033] Methods of making transgenic animals are well-known in the art (for transgenic mice see Brinster et al., Proc. Nat. Acad. Sci. USA 82: 4438-4442 (1985), U.S. Pat. Nos. 4,736,866 and 4,870,009, both by Leder et al., U.S. Pat. No. 4,873,191 by Wagner et al., and Hogan, B., Manipulating the Mouse Embryo, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., (1986); for particle bombardment see U.S. Pat. No. 4,945,050, by Sandford et al.; for transgenic Drosophila see Rubin and Spradling, Science (1982) 218:348-53 and U.S. Pat. No. 4,670,388; for transgenic insects see Berghammer A. J. et al., A Universal Marker for Transgenic Insects (1999) Nature 402:370-371; for transgenic Zebrafish see Lin S., Transgenic Zebrafish, Methods Mol Biol. (2000) 136:375-383; for microinjection procedures for fish, amphibian eggs and birds see Houdebine and Chourrout, Experientia (1991) 47:897-905; for transgenic rats see Hammer et al., Cell (1990) 63:1099-1112; and for culturing of embryonic stem (ES) cells and the subsequent production of transgenic animals by the introduction of DNA into ES cells using methods such as electroporation, calcium phosphate/DNA precipitation and direct injection see, e.g., Teratocarcinomas and Embryonic Stem Cells, A Practical Approach, E. J. Robertson, ed., IRL Press (1987)). Clones of the nonhuman transgenic animals can be produced according to available methods (see Wilmut, I. et al. (1997) Nature 385:810-813; and PCT International Publication Nos. WO 97/07668 and WO 97/07669).

[0034] In one embodiment, the transgenic animal is a “knock-out” animal having a heterozygous or homozygous alteration in the sequence of an endogenous SLC7 gene that results in a decrease of SLC7 function, preferably such that SLC7 expression is undetectable or insignificant. Knock-out animals are typically generated by homologous recombination with a vector comprising a transgene having at least a portion of the gene to be knocked out. Typically a deletion, addition or substitution has been introduced into the transgene to functionally disrupt it. The transgene can be a human gene (e.g., from a human genomic clone) but more preferably is an ortholog of the human gene derived from the transgenic host species. For example, a mouse SLC7 gene is used to construct a homologous recombination vector suitable for altering an endogenous SLC7 gene in the mouse genome. Detailed methodologies for homologous recombination in mice are available (see Capecchi, Science (1989) 244:1288-1292; Joyner et al., Nature (1989) 338:153-156). Procedures for the production of non-rodent transgenic mammals and other animals are also available (Houdebine and Chourrout, supra; Pursel et al., Science (1989) 244:1281-1288; Simms et al., Bio/Technology (1988) 6:179-183). In a preferred embodiment, knock-out animals, such as mice harboring a knockout of a specific gene, may be used to produce antibodies against the human counterpart of the gene that has been knocked out (Claesson MI et al., (1994) Scand J Immunol 40:257-264; Declerck P J et al., (1995) J Biol Chem. 270:8397-400).

[0035] In another embodiment, the transgenic animal is a “knock-in” animal having an alteration in its genome that results in altered expression (e.g., increased (including ectopic) or decreased expression) of the SLC7 gene, e.g., by introduction of additional copies of SLC7, or by operatively inserting a regulatory sequence that provides for altered expression of an endogenous copy of the SLC7 gene. Such regulatory sequences include inducible, tissue-specific, and constitutive promoters and enhancer elements. The knock-in can be homozygous or heterozygous.

[0036] Transgenic nonhuman animals can also be produced that contain selected systems allowing for regulated expression of the transgene. One example of such a system that may be produced is the cre/loxP recombinase system of bacteriophage P1 (Lakso et al., PNAS (1992) 89:6232-6236; U.S. Pat. No. 4,959,317). If a cre/loxP recombinase system is used to regulate expression of the transgene, animals containing transgenes encoding both the Cre recombinase and a selected protein are required. Such animals can be provided through the construction of “double” transgenic animals, e.g., by mating two transgenic animals, one containing a transgene encoding a selected protein and the other containing a transgene encoding a recombinase. Another example of a recombinase system is the FLP recombinase system of Saccharomyces cerevisiae (O'Gorman et al. (1991) Science 251:1351-1355; U.S. Pat. No. 5,654,182). In a preferred embodiment, both Cre-LoxP and Flp-Frt are used in the same system to regulate expression of the transgene, and for sequential deletion of vector sequences in the same cell (Sun X et al (2000) Nat Genet 25:83-6).

[0037] The genetically modified animals can be used in genetic studies to further elucidate the p53 pathway, as animal models of disease and disorders implicating defective p53 function, and for in vivo testing of candidate therapeutic agents, such as those identified in screens described below. The candidate therapeutic agents are administered to a genetically modified animal having altered SLC7 function and phenotypic changes are compared with appropriate control animals such as genetically modified animals that receive placebo treatment, and/or animals with unaltered SLC7 expression that receive candidate therapeutic agent.

[0038] In addition to the above-described genetically modified animals having altered SLC7 function, animal models having defective p53 function (and otherwise normal SLC7 function), can be used in the methods of the present invention. For example, a p53 knockout mouse can be used to assess, in vivo, the activity of a candidate p53 modulating agent identified in one of the in vitro assays described below. p53 knockout mice are described in the literature (Jacks et al., Nature 2001;410:1111-1116, 1043-1044; Donehower et al., supra). Preferably, the candidate p53 modulating agent when administered to a model system with cells defective in p53 function, produces a detectable phenotypic change in the model system indicating that the p53 function is restored, i.e., the cells exhibit normal cell cycle progression.

[0039] Modulating Agents

[0040] The invention provides methods to identify agents that interact with and/or modulate the function of SLC7 and/or the p53 pathway. Modulating agents identified by these methods are also part of the invention. Such agents are useful in a variety of diagnostic and therapeutic applications associated with the p53 pathway, as well as in further analysis of the SLC7 protein and its contribution to the p53 pathway. Accordingly, the invention also provides methods for modulating the p53 pathway comprising the step of specifically modulating SLC7 activity by administering a SLC7-interacting or modulating agent.

[0041] As used herein, an “SLC7-modulating agent” is any agent that modulates SLC7 function, for example, an agent that interacts with SLC7 to inhibit or enhance SLC7 activity or otherwise affect normal SLC7 function. SLC7 function can be affected at any level, including transcription, protein expression, protein localization, and cellular or extra-cellular activity. In a preferred embodiment, the SLC7-modulating agent specifically modulates the function of the SLC7. The phrases “specific modulating agent”, “specifically modulates”, etc., are used herein to refer to modulating agents that directly bind to the SLC7 polypeptide or nucleic acid, and preferably inhibit, enhance, or otherwise alter, the function of the SLC7. These phrases also encompass modulating agents that alter the interaction of the SLC7 with a binding partner, substrate, or cofactor (e.g. by binding to a binding partner of an SLC7, or to a protein/binding partner complex, and altering SLC7 function). In a further preferred embodiment, the SLC7-modulating agent is a modulator of the p53 pathway (e.g. it restores and/or up-regulates p53 function), and thus is also a “p53 modulating agent”.

[0042] Preferred SLC7-modulating agents include small molecule compounds; SLC7-interacting proteins, including antibodies and other biotherapeutics; and nucleic acid modulators such as antisense and RNA inhibitors. The modulating agents may be formulated in pharmaceutical compositions, for example, as compositions that may comprise other active ingredients, as in combination therapy, and/or suitable carriers or excipients. Techniques for formulation and administration of the compounds may be found in “Remington's Pharmaceutical Sciences” Mack Publishing Co., Easton, Pa., 19th edition.

Small Molecule Modulators

[0043] Small molecules are often preferred to modulate function of proteins with enzymatic function, and/or containing protein interaction domains. Chemical agents, referred to in the art as “small molecule” compounds, are typically organic, non-peptide molecules, having a molecular weight less than 10,000, preferably less than 5,000, more preferably less than 1,000, and most preferably less than 500. This class of modulators includes chemically synthesized molecules, for instance, compounds from combinatorial chemical libraries. Synthetic compounds may be rationally designed or identified based on known or inferred properties of the SLC7 protein or may be identified by screening compound libraries. Alternative appropriate modulators of this class are natural products, particularly secondary metabolites from organisms such as plants or fungi, which can also be identified by screening compound libraries for SLC7-modulating activity. Methods for generating and obtaining compounds are well known in the art (Schreiber S L, Science (2000) 151: 1964-1969; Radmann J and Gunther J, Science (2000) 151:1947-1948).

[0044] Small molecule modulators identified from screening assays, as described below, can be used as lead compounds from which candidate clinical compounds may be designed, optimized, and synthesized. Such clinical compounds may have utility in treating pathologies associated with the p53 pathway. The activity of candidate small molecule modulating agents may be improved several-fold through iterative secondary functional validation, as further described below, structure determination, and candidate modulator modification and testing. Additionally, candidate clinical compounds are generated with specific regard to clinical and pharmacological properties. For example, the reagents may be derivatized and re-screened using in vitro and in vivo assays to optimize activity and minimize toxicity for pharmaceutical development.

Protein Modulators

[0045] Specific SLC7-interacting proteins are useful in a variety of diagnostic and therapeutic applications related to the p53 pathway and related disorders, as well as in validation assays for other SLC7-modulating agents. In a preferred embodiment, SLC7-interacting proteins affect normal SLC7 function, including transcription, protein expression, protein localization, and cellular or extra-cellular activity. In another embodiment, SLC7-interacting proteins are useful in detecting and providing information about the function of SLC7 proteins, as is relevant to p53 related disorders, such as cancer (e.g., for diagnostic means).

[0046] An SLC7-interacting protein may be endogenous, i.e. one that naturally interacts genetically or biochemically with an SLC7, such as a member of the SLC7 pathway that modulates SLC7 expression, localization, and/or activity. SLC7-modulators include dominant negative forms of SLC7-interacting proteins and of SLC7 proteins themselves. Yeast two-hybrid and variant screens offer preferred methods for identifying endogenous SLC7-interacting proteins (Finley, R. L. et al. (1996) in DNA Cloning-Expression Systems: A Practical Approach, eds. Glover D. & Hames B. D (Oxford University Press, Oxford, England), pp. 169-203; Fashema SF et al., Gene (2000) 250:1-14; Drees B L Curr Opin Chem Biol (1999) 3:64-70; Vidal M and Legrain P Nucleic Acids Res (1999) 27:919-29; and U.S. Pat. No. 5,928,868). Mass spectrometry is an alternative preferred method for the elucidation of protein complexes (reviewed in, e.g., Pandey A and Mann M, Nature (2000) 405:837-846; Yates J R 3rd, Trends Genet (2000) 16:5-8).

[0047] An SLC7-interacting protein may be an exogenous protein, such as an SLC7-specific antibody or a T-cell antigen receptor (see, e.g., Harlow and Lane (1988) Antibodies, A Laboratory Manual, Cold Spring Harbor Laboratory; Harlow and Lane (1999) Using antibodies: a laboratory manual. Cold Spring Harbor, N.Y.: Cold Spring Harbor Laboratory Press). SLC7 antibodies are further discussed below.

[0048] In preferred embodiments, an SLC7-interacting protein specifically binds an SLC7 protein. In alternative preferred embodiments, an SLC7-modulating agent binds an SLC7 substrate, binding partner, or cofactor.

Antibodies

[0049] In another embodiment, the protein modulator is an SLC7 specific antibody agonist or antagonist. The antibodies have therapeutic and diagnostic utilities, and can be used in screening assays to identify SLC7 modulators. The antibodies can also be used in dissecting the portions of the SLC7 pathway responsible for various cellular responses and in the general processing and maturation of the SLC7.

[0050] Antibodies that specifically bind SLC7 polypeptides can be generated using known methods. Preferably the antibody is specific to a mammalian ortholog of SLC7 polypeptide, and more preferably, to human SLC7. Antibodies may be polyclonal, monoclonal (mAbs), humanized or chimeric antibodies, single chain antibodies, Fab fragments, F(ab′)2 fragments, fragments produced by a Fab expression library, anti-idiotypic (anti-Id) antibodies, and epitope-binding fragments of any of the above. Epitopes of SLC7 which are particularly antigenic can be selected, for example, by routine screening of SLC7 polypeptides for antigenicity or by applying a theoretical method for selecting antigenic regions of a protein (Hopp and Wood (1981), Proc. Natl. Acad. Sci. U.S.A. 78:3824-28; Hopp and Wood, (1983) Mol. Immunol. 20:483-89; Sutcliffe et al., (1983) Science 219:660-66) to the amino acid sequence shown in any of SEQ ID NOs: 37 through 50. Monoclonal antibodies with affinities of 10⁸ M⁻¹, preferably 10⁹ M⁻¹ to 10¹⁰ M⁻¹, or stronger can be made by standard procedures as described (Harlow and Lane, supra; Goding (1986) Monoclonal Antibodies: Principles and Practice (2d ed) Academic Press, New York; and U.S. Pat. Nos. 4,381,292; 4,451,570; and 4,618,577). Antibodies may be generated against crude cell extracts of SLC7 or substantially purified fragments thereof. If SLC7 fragments are used, they preferably comprise at least 10, and more preferably, at least 20 contiguous amino acids of an SLC7 protein. In a particular embodiment, SLC7-specific antigens and/or immunogens are coupled to carrier proteins that stimulate the immune response. For example, the subject polypeptides are covalently coupled to the keyhole limpet hemocyanin (KLH) carrier, and the conjugate is emulsified in Freund's complete adjuvant, which enhances the immune response. An appropriate immune system such as a laboratory rabbit or mouse is immunized according to conventional protocols.

[0051] The presence of SLC7-specific antibodies is assayed by an appropriate assay such as a solid phase enzyme-linked immunosorbent assay (ELISA) using immobilized corresponding SLC7 polypeptides. Other assays, such as radioimmunoassays or fluorescent assays, might also be used.

[0052] Chimeric antibodies specific to SLC7 polypeptides can be made that contain different portions from different animal species. For instance, a human immunoglobulin constant region may be linked to a variable region of a murine mAb, such that the antibody derives its biological activity from the human antibody, and its binding specificity from the murine fragment. Chimeric antibodies are produced by splicing together genes that encode the appropriate regions from each species (Morrison et al., Proc. Natl. Acad. Sci. (1984) 81:6851-6855; Neuberger et al., Nature (1984) 312:604-608; Takeda et al., Nature (1985) 31:452-454). Humanized antibodies, which are a form of chimeric antibodies, can be generated by grafting complementarity-determining regions (CDRs) (Carlos, T. M., J. M. Harlan. 1994. Blood 84:2068-2101) of mouse antibodies into a background of human framework regions and constant regions by recombinant DNA technology (Riechmann L M, et al., 1988 Nature 323: 323-327). Humanized antibodies contain ˜10% murine sequences and ˜90% human sequences, and thus further reduce or eliminate immunogenicity, while retaining the antibody specificities (Co MS, and Queen C. 1991 Nature 351: 501-501; Morrison S L. 1992 Ann. Rev. Immun. 10:239-265). Humanized antibodies and methods of their production are well-known in the art (U.S. Pat. Nos. 5,530,101, 5,585,089, 5,693,762, and 6,180,370).

[0053] SLC7-specific single chain antibodies which are recombinant, single chain polypeptides formed by linking the heavy and light chain fragments of the Fv regions via an amino acid bridge, can be produced by methods known in the art (U.S. Pat. No. 4,946,778; Bird, Science (1988) 242:423-426; Huston et al., Proc. Natl. Acad. Sci. USA (1988) 85:5879-5883; and Ward et al., Nature (1989) 334:544-546).

[0054] Other suitable techniques for antibody production involve in vitro exposure of lymphocytes to the antigenic polypeptides or alternatively to selection of libraries of antibodies in phage or similar vectors (Huse et al., Science (1989) 246:1275-1281). As used herein, T-cell antigen receptors are included within the scope of antibody modulators (Harlow and Lane, 1988, supra).

[0055] The polypeptides and antibodies of the present invention may be used with or without modification. Frequently, antibodies will be labeled by joining, either covalently or non-covalently, a substance that provides for a detectable signal, or that is toxic to cells that express the targeted protein (Menard S, et al., Int J. Biol Markers (1989) 4:131-134). A wide variety of labels and conjugation techniques are known and are reported extensively in both the scientific and patent literature. Suitable labels include radionuclides, enzymes, substrates, cofactors, inhibitors, fluorescent moieties, fluorescent emitting lanthanide metals, chemiluminescent moieties, bioluminescent moieties, magnetic particles, and the like (U.S. Pat. Nos. 3,817,837; 3,850,752; 3,939,350; 3,996,345; 4,277,437; 4,275,149; and 4,366,241). Also, recombinant immunoglobulins may be produced (U.S. Pat. No. 4,816,567). Antibodies to cytoplasmic polypeptides may be delivered and reach their targets by conjugation with membrane-penetrating toxin proteins (U.S. Pat. No. 6,086,900).

[0056] When used therapeutically in a patient, the antibodies of the subject invention are typically administered parenterally, when possible at the target site, or intravenously. The therapeutically effective dose and dosage regimen is determined by clinical studies. Typically, the amount of antibody administered is in the range of about 0.1 mg/kg to about 10 mg/kg of patient weight. For parenteral administration, the antibodies are formulated in a unit dosage injectable form (e.g., solution, suspension, emulsion) in association with a pharmaceutically acceptable vehicle. Such vehicles are inherently nontoxic and non-therapeutic. Examples are water, saline, Ringer's solution, dextrose solution, and 5% human serum albumin. Nonaqueous vehicles such as fixed oils, ethyl oleate, or liposome carriers may also be used. The vehicle may contain minor amounts of additives, such as buffers and preservatives, which enhance isotonicity and chemical stability or otherwise enhance therapeutic potential. The antibodies' concentrations in such vehicles are typically in the range of about 1 mg/ml to about 10 mg/ml. Immunotherapeutic methods are further described in the literature (U.S. Pat. No. 5,859,206; WO0073469).
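
As a hedged illustration of the weight-based arithmetic in the preceding paragraph, the following Python sketch computes a total antibody amount and the corresponding injection volume. The function name, the 70 kg patient weight, and the chosen dose and concentration are hypothetical; only the typical ranges (about 0.1 to about 10 mg/kg, and about 1 to about 10 mg/ml) come from the text, and an actual dose and regimen would be determined by clinical studies.

```python
def antibody_dose(weight_kg, dose_mg_per_kg, conc_mg_per_ml):
    """Return (total_mg, volume_ml) for a weight-based antibody dose.

    Illustrative only: the range checks mirror the typical ranges stated
    in the text (0.1-10 mg/kg; 1-10 mg/ml) and are not a clinical rule.
    """
    if weight_kg <= 0:
        raise ValueError("weight must be positive")
    if not 0.1 <= dose_mg_per_kg <= 10:
        raise ValueError("dose outside the typical 0.1-10 mg/kg range")
    if not 1 <= conc_mg_per_ml <= 10:
        raise ValueError("concentration outside the typical 1-10 mg/ml range")
    total_mg = weight_kg * dose_mg_per_kg      # mg of antibody to administer
    volume_ml = total_mg / conc_mg_per_ml      # ml of formulated vehicle
    return total_mg, volume_ml


# Hypothetical example: 70 kg patient, 2 mg/kg, formulated at 5 mg/ml.
total_mg, volume_ml = antibody_dose(70, 2, 5)
print(total_mg, volume_ml)
```

The range checks are a design choice for the sketch; a value outside the stated typical ranges raises an error rather than being silently accepted.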

Specific Biotherapeutics

[0057] In a preferred embodiment, an SLC7-interacting protein may have biotherapeutic applications. Biotherapeutic agents formulated in pharmaceutically acceptable carriers and dosages may be used to activate or inhibit signal transduction pathways. This modulation may be accomplished by binding a ligand, thus inhibiting the activity of the pathway; or by binding a receptor, either to inhibit activation of, or to activate, the receptor. Alternatively, the biotherapeutic may itself be a ligand capable of activating or inhibiting a receptor. Biotherapeutic agents and methods of producing them are described in detail in U.S. Pat. No. 6,146,628.

[0058] The SLC7 ligand(s), antibodies to the ligand(s) or the SLC7 itself may be used as biotherapeutics to modulate the activity of SLC7 in the p53 pathway.

Nucleic Acid Modulators

[0059] Other preferred SLC7-modulating agents comprise nucleic acid molecules, such as antisense oligomers or double stranded RNA (dsRNA), which generally inhibit SLC7 activity. Preferred nucleic acid modulators interfere with the function of the SLC7 nucleic acid such as DNA replication, transcription, translocation of the SLC7 RNA to the site of protein translation, translation of protein from the SLC7 RNA, splicing of the SLC7 RNA to yield one or more mRNA species, or catalytic activity which may be engaged in or facilitated by the SLC7 RNA.

[0060] In one embodiment, the antisense oligomer is an oligonucleotide that is sufficiently complementary to an SLC7 mRNA to bind to and prevent translation, preferably by binding to the 5′ untranslated region. SLC7-specific antisense oligonucleotides preferably range from at least 6 to about 200 nucleotides. In some embodiments the oligonucleotide is preferably at least 10, 15, or 20 nucleotides in length. In other embodiments, the oligonucleotide is preferably less than 50, 40, or 30 nucleotides in length. The oligonucleotide can be DNA or RNA or a chimeric mixture or derivatives or modified versions thereof, single-stranded or double-stranded. The oligonucleotide can be modified at the base moiety, sugar moiety, or phosphate backbone. The oligonucleotide may include other appending groups such as peptides, agents that facilitate transport across the cell membrane, hybridization-triggered cleavage agents, and intercalating agents.
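
The complementarity requirement described above can be sketched in a few lines of Python. This is a minimal illustration, assuming an unmodified DNA oligonucleotide and a perfectly complementary target; the function name and the example target sequence are hypothetical and are not taken from any SLC7 sequence in this application.

```python
# Watson-Crick partners for an RNA target read out as a DNA oligonucleotide.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A"}


def antisense_dna(mrna_target):
    """Return the DNA oligo (5'->3') complementary to an mRNA target (5'->3')."""
    target = mrna_target.upper()
    # Length bounds follow the range given in the text (6 to about 200 nt).
    if not 6 <= len(target) <= 200:
        raise ValueError("oligo length should be from 6 to about 200 nucleotides")
    if set(target) - set(COMPLEMENT):
        raise ValueError("target must contain only A, C, G, U")
    # Reverse the target, then complement each base.
    return "".join(COMPLEMENT[base] for base in reversed(target))


# Hypothetical 20-nt stretch of a 5' untranslated region (not a real SLC7 sequence).
example_target = "GGCUACGAUCCGAUUAGCAA"
print(antisense_dna(example_target))
```

In practice, candidate oligomers would also be checked for secondary structure and off-target complementarity, which this sketch does not attempt.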

[0061] In another embodiment, the antisense oligomer is a phosphorodiamidate morpholino oligomer (PMO). PMOs are assembled from four different morpholino subunits, each of which contains one of four genetic bases (A, C, G, or T) linked to a six-membered morpholine ring. Polymers of these subunits are joined by non-ionic phosphodiamidate intersubunit linkages. Details of how to make and use PMOs and other antisense oligomers are well known in the art (e.g. see WO99/18193; Probst J C, Antisense Oligodeoxynucleotide and Ribozyme Design, Methods. (2000) 22(3):271-281; Summerton J, and Weller D. 1997 Antisense Nucleic Acid Drug Dev. 7:187-95; U.S. Pat. No. 5,235,033; and U.S. Pat. No. 5,378,841).

[0062] Alternative preferred SLC7 nucleic acid modulators are double-stranded RNA species mediating RNA interference (RNAi). RNAi is the process of sequence-specific, post-transcriptional gene silencing in animals and plants, initiated by double-stranded RNA (dsRNA) that is homologous in sequence to the silenced gene. Methods relating to the use of RNAi to silence genes in C. elegans, Drosophila, plants, and humans are known in the art (Fire A, et al., 1998 Nature 391:806-811; Fire, A. Trends Genet. 15, 358-363 (1999); Sharp, P. A. RNA interference 2001. Genes Dev. 15, 485-490 (2001); Hammond, S. M., et al., Nature Rev. Genet. 2, 110-119 (2001); Tuschl, T. Chem. Biochem. 2, 239-245 (2001); Hamilton, A. et al., Science 286, 950-952 (1999); Hammond, S. M., et al., Nature 404, 293-296 (2000); Zamore, P. D., et al., Cell 101, 25-33 (2000); Bernstein, E., et al., Nature 409, 363-366 (2001); Elbashir, S. M., et al., Genes Dev. 15, 188-200 (2001); WO0129058; WO9932619; Elbashir S M, et al., 2001 Nature 411:494-498).

[0063] Nucleic acid modulators are commonly used as research reagents, diagnostics, and therapeutics. For example, antisense oligonucleotides, which are able to inhibit gene expression with exquisite specificity, are often used to elucidate the function of particular genes (see, for example, U.S. Pat. No. 6,165,790). Nucleic acid modulators are also used, for example, to distinguish between functions of various members of a biological pathway. For example, antisense oligomers have been employed as therapeutic moieties in the treatment of disease states in animals and man and have been demonstrated in numerous clinical trials to be safe and effective (Milligan J F, et al, Current Concepts in Antisense Drug Design, J Med Chem. (1993) 36:1923-1937; Tonkinson J L et al., Antisense Oligodeoxynucleotides as Clinical Therapeutic Agents, Cancer Invest. (1996) 14:54-65). Accordingly, in one aspect of the invention, an SLC7-specific nucleic acid modulator is used in an assay to further elucidate the role of the SLC7 in the p53 pathway, and/or its relationship to other members of the pathway. In another aspect of the invention, an SLC7-specific antisense oligomer is used as a therapeutic agent for treatment of p53-related disease states.

[0064] Assay Systems

[0065] The invention provides assay systems and screening methods for identifying specific modulators of SLC7 activity. As used herein, an “assay system” encompasses all the components required for performing and analyzing results of an assay that detects and/or measures a particular event. In general, primary assays are used to identify or confirm a modulator's specific biochemical or molecular effect with respect to the SLC7 nucleic acid or protein. In general, secondary assays further assess the activity of a SLC7 modulating agent identified by a primary assay and may confirm that the modulating agent affects SLC7 in a manner relevant to the p53 pathway. In some cases, SLC7 modulators will be directly tested in a secondary assay.

[0066] In a preferred embodiment, the screening method comprises contacting a suitable assay system comprising an SLC7 polypeptide or nucleic acid with a candidate agent under conditions whereby, but for the presence of the agent, the system provides a reference activity (e.g. transporter activity), which is based on the particular molecular event the screening method detects. A statistically significant difference between the agent-biased activity and the reference activity indicates that the candidate agent modulates SLC7 activity, and hence the p53 pathway. The SLC7 polypeptide or nucleic acid used in the assay may comprise any of the nucleic acids or polypeptides described above (e.g. SEQ ID NOs 1-54). In one preferred embodiment, the SLC7 is an SLC7A5, comprising a nucleic acid sequence selected from any one of SEQ ID NOs 1-6, and 51, or an amino acid sequence selected from any one of SEQ ID NOs 37, 38, and 54. In a further preferred embodiment, the SLC7A5 nucleic acid comprises SEQ ID NO: 51, and the protein comprises SEQ ID NO: 54. In another preferred embodiment, the SLC7 is an SLC7A11 comprising a nucleic acid sequence selected from any one of SEQ ID NOs 29-34, 52, and 53, or an amino acid sequence selected from any one of SEQ ID NOs 48 and 49. In a further preferred embodiment, the SLC7A11 nucleic acid comprises SEQ ID NO: 52, or its splice-variant, SEQ ID NO: 53.
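
One way to make the comparison between agent-biased activity and reference activity concrete is sketched below in Python. The text does not prescribe a particular statistical test; the exact two-sided permutation test on group means used here is an assumption chosen because it needs only the standard library. The function name, the replicate readings, and the 0.05 threshold are likewise hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(reference, treated):
    """Two-sided exact permutation test on the difference in group means."""
    pooled = list(reference) + list(treated)
    n_ref = len(reference)
    observed = abs(mean(treated) - mean(reference))
    total = 0
    extreme = 0
    # Enumerate every way of relabeling the pooled readings as "reference".
    for idx in combinations(range(len(pooled)), n_ref):
        chosen = set(idx)
        ref = [pooled[i] for i in chosen]
        trt = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point ties.
        if abs(mean(trt) - mean(ref)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Hypothetical substrate-uptake readings (arbitrary units), not data from the source.
reference_activity = [100.0, 98.0, 103.0, 101.0]
agent_activity = [62.0, 70.0, 66.0, 59.0]
p = permutation_p_value(reference_activity, agent_activity)
print(p < 0.05)
```

Exhaustive enumeration is practical only for small replicate counts; with larger groups a sampled permutation test or a parametric test would be the usual substitute.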

Primary Assays

[0067] The type of modulator tested generally determines the type of primary assay.

Primary Assays for Small Molecule Modulators

[0068] For small molecule modulators, screening assays are used to identify candidate modulators. Screening assays may be cell-based or may use a cell-free system that recreates or retains the relevant biochemical reaction of the target protein (reviewed in Sittampalam GS et al., Curr Opin Chem Biol (1997) 1:384-91 and accompanying references). As used herein the term “cell-based” refers to assays using live cells, dead cells, or a particular cellular fraction, such as a membrane, endoplasmic reticulum, or mitochondrial fraction. The term “cell free” encompasses assays using substantially purified protein (either endogenous or recombinantly produced), partially purified or crude cellular extracts. Screening assays may detect a variety of molecular events, including protein-DNA interactions, protein-protein interactions (e.g., receptor-ligand binding), transcriptional activity (e.g., using a reporter gene), enzymatic activity (e.g., via a property of the substrate), activity of second messengers, immunogenicity and changes in cellular morphology or other cellular characteristics. Appropriate screening assays may use a wide range of detection methods including fluorescent, radioactive, colorimetric, spectrophotometric, and amperometric methods, to provide a read-out for the particular molecular event detected.

[0069] Cell-based screening assays usually require systems for recombinant expression of SLC7 and any auxiliary proteins demanded by the particular assay. Appropriate methods for generating recombinant proteins produce sufficient quantities of proteins that retain their relevant biological activities and are of sufficient purity to optimize activity and assure assay reproducibility. Yeast two-hybrid and variant screens, and mass spectrometry provide preferred methods for determining protein-protein interactions and elucidation of protein complexes. In certain applications, when SLC7-interacting proteins are used in screens to identify small molecule modulators, the binding specificity of the interacting protein to the SLC7 protein may be assayed by various known methods such as substrate processing (e.g. ability of the candidate SLC7-specific binding agents to function as negative effectors in SLC7-expressing cells), binding equilibrium constants (usually at least about 10⁷ M⁻¹, preferably at least about 10⁸ M⁻¹, more preferably at least about 10⁹ M⁻¹), and immunogenicity (e.g. ability to elicit SLC7 specific antibody in a heterologous host such as a mouse, rat, goat or rabbit). For enzymes and receptors, binding may be assayed by, respectively, substrate and ligand processing.

[0070] The screening assay may measure a candidate agent's ability to specifically bind to or modulate activity of a SLC7 polypeptide, a fusion protein thereof, or to cells or membranes bearing the polypeptide or fusion protein. The SLC7 polypeptide can be full length or a fragment thereof that retains functional SLC7 activity. The SLC7 polypeptide may be fused to another polypeptide, such as a peptide tag for detection or anchoring, or to another tag. The SLC7 polypeptide is preferably human SLC7, or is an ortholog or derivative thereof as described above. In a preferred embodiment, the screening assay detects candidate agent-based modulation of SLC7 interaction with a binding target, such as an endogenous or exogenous protein or other substrate that has SLC7-specific binding activity, and can be used to assess normal SLC7 gene function.

[0071] Suitable assay formats that may be adapted to screen for SLC7 modulators are known in the art. Preferred screening assays are high throughput or ultra high throughput and thus provide automated, cost-effective means of screening compound libraries for lead compounds (Fernandes PB, Curr Opin Chem Biol (1998) 2:597-603; Sundberg S A, Curr Opin Biotechnol 2000, 11:47-53). In one preferred embodiment, screening assays use fluorescence technologies, including fluorescence polarization, time-resolved fluorescence, and fluorescence resonance energy transfer. These systems offer means to monitor protein-protein or DNA-protein interactions in which the intensity of the signal emitted from dye-labeled molecules depends upon their interactions with partner molecules (e.g., Selvin P R, Nat Struct Biol (2000) 7:730-4; Fernandes P B, supra; Hertzberg R P and Pope A J, Curr Opin Chem Biol (2000) 4:445-451).

[0072] A variety of suitable assay systems may be used to identify candidate SLC7 and p53 pathway modulators (e.g. U.S. Pat. No. 6,020,135 (p53 modulation), and U.S. Pat. Nos. 5,550,019 and 6,133,437 (apoptosis assays), among others). Specific preferred assays are described in more detail below.

[0073] Transporter assays. Transporter proteins carry a range of substrates, including nutrients, ions, amino acids, and drugs, across cell membranes. Assays for modulators of transporters may use labeled substrates. For instance, exemplary high throughput screens to identify compounds that interact with different peptide and anion transporters both use fluorescently labeled substrates; the assay for peptide transport additionally uses multiscreen filtration plates (Blevitt J M et al., J Biomol Screen 1999, 4:87-91; Cihlar T and Ho ES, Anal Biochem 2000, 283:49-55).

[0074] Apoptosis assays. Assays for apoptosis may be performed by terminal deoxynucleotidyl transferase-mediated digoxigenin-11-dUTP nick end labeling (TUNEL) assay. The TUNEL assay is used to measure nuclear DNA fragmentation characteristic of apoptosis (Lazebnik et al., 1994, Nature 371, 346), by following the incorporation of fluorescein-dUTP (Yonehara et al., 1989, J. Exp. Med. 169, 1747). Apoptosis may further be assayed by acridine orange staining of tissue culture cells (Lucas, R., et al., 1998, Blood 15:4730-41). An apoptosis assay system may comprise a cell that expresses an SLC7, and that optionally has defective p53 function (e.g. p53 is over-expressed or under-expressed relative to wild-type cells). A test agent can be added to the apoptosis assay system, and changes in induction of apoptosis relative to controls where no test agent is added identify candidate p53 modulating agents. In some embodiments of the invention, an apoptosis assay may be used as a secondary assay to test a candidate p53 modulating agent that is initially identified using a cell-free assay system. An apoptosis assay may also be used to test whether SLC7 function plays a direct role in apoptosis. For example, an apoptosis assay may be performed on cells that over- or under-express SLC7 relative to wild type cells. Differences in apoptotic response compared to wild type cells suggest that the SLC7 plays a direct role in the apoptotic response. Apoptosis assays are described further in U.S. Pat. No. 6,133,437.

[0075] Cell proliferation and cell cycle assays. Cell proliferation may be assayed via bromodeoxyuridine (BRDU) incorporation. This assay identifies a cell population undergoing DNA synthesis by incorporation of BRDU into newly-synthesized DNA. Newly-synthesized DNA may then be detected using an anti-BRDU antibody (Hoshino et al., 1986, Int. J. Cancer 38, 369; Campana et al., 1988, J. Immunol. Meth. 107, 79), or by other means.

[0076] Cell proliferation may also be examined using [³H]-thymidine incorporation (Chen, J., 1996, Oncogene 13:1395-403; Jeoung, J., 1995, J. Biol. Chem. 270:18367-73). This assay allows for quantitative characterization of S-phase DNA synthesis. In this assay, cells synthesizing DNA will incorporate [³H]-thymidine into newly synthesized DNA. Incorporation can then be measured by standard techniques such as by counting of radioisotope in a scintillation counter (e.g., Beckman LS 3800 Liquid Scintillation Counter).
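
As a hedged sketch of how scintillation counts from such an assay might be summarized, the Python function below expresses incorporation in treated cells as a percentage of an untreated control after background subtraction. This normalization is a common convention assumed here for illustration and is not specified in the text; the function name and the counts per minute (CPM) are hypothetical.

```python
def percent_of_control(treated_cpm, control_cpm, background_cpm=0.0):
    """Express [3H]-thymidine incorporation as a percentage of untreated control."""
    control = control_cpm - background_cpm
    if control <= 0:
        raise ValueError("control counts must exceed background")
    # Subtract background from both readings before taking the ratio.
    return 100.0 * (treated_cpm - background_cpm) / control


# Hypothetical scintillation counts (CPM); not values from the source.
print(percent_of_control(treated_cpm=4200, control_cpm=12200, background_cpm=200))
```

A value well below 100 would indicate reduced S-phase DNA synthesis in the treated cells relative to the control under these assumptions.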

[0077] Cell proliferation may also be assayed by colony formation in soft agar (Sambrook et al., Molecular Cloning, Cold Spring Harbor (1989)). For example, cells transformed with SLC7 are seeded in soft agar plates, and colonies are measured and counted after two weeks of incubation.

[0078] Involvement of a gene in the cell cycle may be assayed by flow cytometry (Gray J W et al. (1986) Int J Radiat Biol Relat Stud Phys Chem Med 49:237-55). Cells transfected with an SLC7 may be stained with propidium iodide and evaluated in a flow cytometer (available from Becton Dickinson).

[0079] Accordingly, a cell proliferation or cell cycle assay system may comprise a cell that expresses an SLC7, and that optionally has defective p53 function (e.g. p53 is over-expressed or under-expressed relative to wild-type cells). A test agent can be added to the assay system, and changes in cell proliferation or cell cycle relative to controls where no test agent is added identify candidate p53 modulating agents. In some embodiments of the invention, the cell proliferation or cell cycle assay may be used as a secondary assay to test a candidate p53 modulating agent that is initially identified using another assay system such as a cell-free kinase assay system. A cell proliferation assay may also be used to test whether SLC7 function plays a direct role in cell proliferation or cell cycle. For example, a cell proliferation or cell cycle assay may be performed on cells that over- or under-express SLC7 relative to wild type cells. Differences in proliferation or cell cycle compared to wild type cells suggest that the SLC7 plays a direct role in cell proliferation or cell cycle.

[0080] Angiogenesis. Angiogenesis may be assayed using various human endothelial cell systems, such as umbilical vein, coronary artery, or dermal cells. Suitable assays include Alamar Blue based assays (available from Biosource International) to measure proliferation; migration assays using fluorescent molecules, such as the use of Becton Dickinson Falcon HTS FluoroBlock cell culture inserts to measure migration of cells through membranes in the presence or absence of angiogenesis enhancers or suppressors; and tubule formation assays based on the formation of tubular structures by endothelial cells on Matrigel® (Becton Dickinson). Accordingly, an angiogenesis assay system may comprise a cell that expresses an SLC7, and that optionally has defective p53 function (e.g. p53 is over-expressed or under-expressed relative to wild-type cells). A test agent can be added to the angiogenesis assay system, and changes in angiogenesis relative to controls where no test agent is added identify candidate p53 modulating agents. In some embodiments of the invention, the angiogenesis assay may be used as a secondary assay to test a candidate p53 modulating agent that is initially identified using another assay system. An angiogenesis assay may also be used to test whether SLC7 function plays a direct role in cell proliferation. For example, an angiogenesis assay may be performed on cells that over- or under-express SLC7 relative to wild type cells. Differences in angiogenesis compared to wild type cells suggest that the SLC7 plays a direct role in angiogenesis.

[0081] Hypoxic induction. The alpha subunit of the transcription factor, hypoxia inducible factor-1 (HIF-1), is upregulated in tumor cells following exposure to hypoxia in vitro. Under hypoxic conditions, HIF-1 stimulates the expression of genes known to be important in tumour cell survival, such as those encoding glycolytic enzymes and VEGF. Induction of such genes by hypoxic conditions may be assayed by growing cells transfected with SLC7 in hypoxic conditions (such as with 0.1% O2, 5% CO2, and balance N2, generated in a Napco 7001 incubator (Precision Scientific)) and normoxic conditions, followed by assessment of gene activity or expression by Taqman®. For example, a hypoxic induction assay system may comprise a cell that expresses an SLC7, and that optionally has a mutated p53 (e.g. p53 is over-expressed or under-expressed relative to wild-type cells). A test agent can be added to the hypoxic induction assay system, and changes in hypoxic response relative to controls where no test agent is added identify candidate p53 modulating agents. In some embodiments of the invention, the hypoxic induction assay may be used as a secondary assay to test a candidate p53 modulating agent that is initially identified using another assay system. A hypoxic induction assay may also be used to test whether SLC7 function plays a direct role in the hypoxic response. For example, a hypoxic induction assay may be performed on cells that over- or under-express SLC7 relative to wild type cells. Differences in hypoxic response compared to wild type cells suggest that the SLC7 plays a direct role in hypoxic induction.

[0082] Cell adhesion. Cell adhesion assays measure adhesion of cells to purified adhesion proteins, or adhesion of cells to each other, in the presence or absence of candidate modulating agents. Cell-protein adhesion assays measure the ability of agents to modulate the adhesion of cells to purified proteins. For example, recombinant proteins are produced, diluted to 2.5 μg/mL in PBS, and used to coat the wells of a microtiter plate. The wells used for negative control are not coated. Coated wells are then washed, blocked with 1% BSA, and washed again. Compounds are diluted to 2× final test concentration and added to the blocked, coated wells. Cells are then added to the wells, and the unbound cells are washed off. Retained cells are labeled directly on the plate by adding a membrane-permeable fluorescent dye, such as calcein-AM, and the signal is quantified in a fluorescent microplate reader.

[0083] Cell-cell adhesion assays measure the ability of agents to modulate binding of cell adhesion proteins with their native ligands. These assays use cells that naturally or recombinantly express the adhesion protein of choice. In an exemplary assay, cells expressing the cell adhesion protein are plated in wells of a multiwell plate. Cells expressing the ligand are labeled with a membrane-permeable fluorescent dye, such as BCECF, and allowed to adhere to the monolayers in the presence of candidate agents. Unbound cells are washed off, and bound cells are detected using a fluorescence plate reader.

[0084] High-throughput cell adhesion assays have also been described. In one such assay, small molecule ligands and peptides are bound to the surface of microscope slides using a microarray spotter, intact cells are then contacted with the slides, and unbound cells are washed off. In this assay, not only is the binding specificity of the peptides and modulators against cell lines determined, but the functional cell signaling of attached cells is also measured using immunofluorescence techniques in situ on the microchip (Falsey J R et al., Bioconjug Chem. May-June 2001; 12(3):346-53).

Primary Assays for Antibody Modulators

[0085] For antibody modulators, an appropriate primary assay is a binding assay that tests the antibody's affinity to and specificity for the SLC7 protein. Methods for testing antibody affinity and specificity are well known in the art (Harlow and Lane, 1988, 1999, supra). The enzyme-linked immunosorbent assay (ELISA) is a preferred method for detecting SLC7-specific antibodies; others include FACS assays, radioimmunoassays, and fluorescent assays.

Primary Assays for Nucleic Acid Modulators

[0086] For nucleic acid modulators, primary assays may test the ability of the nucleic acid modulator to inhibit or enhance SLC7 gene expression, preferably mRNA expression. In general, expression analysis comprises comparing SLC7 expression in like populations of cells (e.g., two pools of cells that endogenously or recombinantly express SLC7) in the presence and absence of the nucleic acid modulator. Methods for analyzing mRNA and protein expression are well known in the art. For instance, Northern blotting, slot blotting, ribonuclease protection, quantitative RT-PCR (e.g., using the TaqMan® system, PE Applied Biosystems), or microarray analysis may be used to confirm that SLC7 mRNA expression is reduced in cells treated with the nucleic acid modulator (e.g., Current Protocols in Molecular Biology (1994) Ausubel F M et al., eds., John Wiley & Sons, Inc., chapter 4; Freeman W M et al., Biotechniques (1999) 26:112-125; Kallioniemi O P, Ann Med 2001, 33:142-147; Blohm D H and Guiseppi-Elie A, Curr Opin Biotechnol 2001, 12:41-47). Protein expression may also be monitored. Proteins are most commonly detected with specific antibodies or antisera directed against either the SLC7 protein or specific peptides. A variety of means, including Western blotting, ELISA, or in situ detection, are available (Harlow E and Lane D, 1988 and 1999, supra).

Secondary Assays

[0087] Secondary assays may be used to further assess the activity of an SLC7-modulating agent identified by any of the above methods to confirm that the modulating agent affects SLC7 in a manner relevant to the p53 pathway. As used herein, SLC7-modulating agents encompass candidate clinical compounds or other agents derived from previously identified modulating agents. Secondary assays can also be used to test the activity of a modulating agent on a particular genetic or biochemical pathway or to test the specificity of the modulating agent's interaction with SLC7.

[0088] Secondary assays generally compare like populations of cells or animals (e.g., two pools of cells or animals that endogenously or recombinantly express SLC7) in the presence and absence of the candidate modulator. In general, such assays test whether treatment of cells or animals with a candidate SLC7-modulating agent results in changes in the p53 pathway in comparison to untreated (or mock- or placebo-treated) cells or animals. Certain assays use “sensitized genetic backgrounds”, which, as used herein, describe cells or animals engineered for altered expression of genes in the p53 or interacting pathways.

Cell-based Assays

[0089] Cell based assays may use a variety of mammalian cell lines known to have defective p53 function (e.g. SAOS-2 osteoblasts, H1299 lung cancer cells, C33A and HT3 cervical cancer cells, HT-29 and DLD-1 colon cancer cells, among others, available from American Type Culture Collection (ATCC), Manassas, Va.). Cell based assays may detect endogenous p53 pathway activity or may rely on recombinant expression of p53 pathway components. Any of the aforementioned assays may be used in this cell-based format. Candidate modulators are typically added to the cell media but may also be injected into cells or delivered by any other efficacious means.

Animal Assays

[0090] A variety of non-human animal models of normal or defective p53 pathway may be used to test candidate SLC7 modulators. Models for defective p53 pathway typically use genetically modified animals that have been engineered to mis-express (e.g., over-express or lack expression in) genes involved in the p53 pathway. Assays generally require systemic delivery of the candidate modulators, such as by oral administration, injection, etc.

[0091] In a preferred embodiment, p53 pathway activity is assessed by monitoring neovascularization and angiogenesis. Animal models with defective and normal p53 are used to test the candidate modulator's effect on SLC7 in Matrigel® assays. Matrigel® is an extract of basement membrane proteins, and is composed primarily of laminin, collagen IV, and heparan sulfate proteoglycan. It is provided as a sterile liquid at 4° C., but rapidly forms a solid gel at 37° C. Liquid Matrigel® is mixed with various angiogenic agents, such as bFGF and VEGF, or with human tumor cells which over-express the SLC7. The mixture is then injected subcutaneously (SC) into female athymic nude mice (Taconic, Germantown, N.Y.) to support an intense vascular response. Mice with Matrigel® pellets may be dosed via oral (PO), intraperitoneal (IP), or intravenous (IV) routes with the candidate modulator. Mice are euthanized 5-12 days post-injection, and the Matrigel® pellet is harvested for hemoglobin analysis (Sigma plasma hemoglobin kit). Hemoglobin content of the gel is found to correlate with the degree of neovascularization in the gel.

[0092] In another preferred embodiment, the effect of the candidate modulator on SLC7 is assessed via tumorigenicity assays. In one example, xenograft human tumors are implanted SC into female athymic mice, 6-7 weeks old, as single cell suspensions either from a pre-existing tumor or from in vitro culture. The tumors which express the SLC7 endogenously are injected in the flank, 1×10⁵ to 1×10⁷ cells per mouse in a volume of 100 μL using a 27-gauge needle. Mice are then ear tagged and tumors are measured twice weekly. Candidate modulator treatment is initiated on the day the mean tumor weight reaches 100 mg. Candidate modulator is delivered IV, SC, IP, or PO by bolus administration. Depending upon the pharmacokinetics of each unique candidate modulator, dosing can be performed multiple times per day. The tumor weight is assessed by measuring perpendicular diameters with a caliper and calculated by multiplying the measurements of diameters in two dimensions. At the end of the experiment, the excised tumors may be utilized for biomarker identification or further analyses. For immunohistochemistry staining, xenograft tumors are fixed in 4% paraformaldehyde, 0.1M phosphate, pH 7.2, for 6 hours at 4° C., immersed in 30% sucrose in PBS, and rapidly frozen in isopentane cooled with liquid nitrogen.

[0093] Diagnostic and Therapeutic Uses

[0094] Specific SLC7-modulating agents are useful in a variety of diagnostic and therapeutic applications where disease or disease prognosis is related to defects in the p53 pathway, such as angiogenic, apoptotic, or cell proliferation disorders. Accordingly, the invention also provides methods for modulating the p53 pathway in a cell, preferably a cell predetermined to have defective or impaired p53 function (e.g. due to overexpression, underexpression, or misexpression of p53, or due to gene mutations), comprising the step of administering an agent to the cell that specifically modulates SLC7 activity. Preferably, the modulating agent produces a detectable phenotypic change in the cell indicating that the p53 function is restored. The phrase “function is restored”, and equivalents, as used herein, means that the desired phenotype is achieved, or is brought closer to normal compared to untreated cells. For example, with restored p53 function, cell proliferation and/or progression through cell cycle may normalize, or be brought closer to normal relative to untreated cells. The invention also provides methods for treating disorders or disease associated with impaired p53 function by administering a therapeutically effective amount of an SLC7-modulating agent that modulates the p53 pathway. The invention further provides methods for modulating SLC7 function in a cell, preferably a cell pre-determined to have defective or impaired SLC7 function, by administering an SLC7-modulating agent. Additionally, the invention provides a method for treating disorders or disease associated with impaired SLC7 function by administering a therapeutically effective amount of an SLC7-modulating agent. In certain embodiments the impaired SLC7 function is attributable to impaired SLC7A5 or SLC7A11.

[0095] The discovery that SLC7 is implicated in the p53 pathway provides for a variety of methods that can be employed for the diagnostic and prognostic evaluation of diseases and disorders involving defects in the p53 pathway and for the identification of subjects having a predisposition to such diseases and disorders.

[0096] Various expression analysis methods can be used to diagnose whether SLC7 expression occurs in a particular sample, including Northern blotting, slot blotting, ribonuclease protection, quantitative RT-PCR, and microarray analysis (e.g., Current Protocols in Molecular Biology (1994) Ausubel F M et al., eds., John Wiley & Sons, Inc., chapter 4; Freeman W M et al., Biotechniques (1999) 26:112-125; Kallioniemi O P, Ann Med 2001, 33:142-147; Blohm and Guiseppi-Elie, Curr Opin Biotechnol 2001, 12:41-47). Tissues having a disease or disorder implicating defective p53 signaling that express an SLC7 are identified as amenable to treatment with an SLC7 modulating agent. In a preferred application, the p53 defective tissue overexpresses an SLC7 relative to normal tissue. For example, a Northern blot analysis of mRNA from tumor and normal cell lines, or from tumor and matching normal tissue samples from the same patient, using full or partial SLC7 cDNA sequences as probes, can determine whether particular tumors express or overexpress SLC7. Alternatively, TaqMan® is used for quantitative RT-PCR analysis of SLC7 expression in cell lines, normal tissues and tumor samples (PE Applied Biosystems).

[0097] Various other diagnostic methods may be performed, for example, utilizing reagents such as the SLC7 oligonucleotides, and antibodies directed against an SLC7, as described above for: (1) the detection of the presence of SLC7 gene mutations, or the detection of either over- or under-expression of SLC7 mRNA relative to the non-disorder state; (2) the detection of either an over- or an under-abundance of SLC7 gene product relative to the non-disorder state; and (3) the detection of perturbations or abnormalities in the signal transduction pathway mediated by SLC7.

[0098] Thus, in a specific embodiment, the invention is drawn to a method for diagnosing a disease or disorder in a patient that is associated with alterations in SLC7 expression, the method comprising: a) obtaining a biological sample from the patient; b) contacting the sample with a probe for SLC7 expression; c) comparing results from step (b) with a control; and d) determining whether step (c) indicates a likelihood of the disease or disorder. Preferably, the disease is cancer, most preferably a cancer as shown in TABLE 1. The probe may be either DNA or protein, including an antibody.

EXAMPLES

[0099] The following experimental section and examples are offered by way of illustration and not by way of limitation.

I. Drosophila p53 Screen

[0100] The Drosophila p53 gene was overexpressed specifically in the wing using the vestigial margin quadrant enhancer. Increasing quantities of Drosophila p53 (titrated using different strength transgenic inserts in 1 or 2 copies) caused deterioration of normal wing morphology from mild to strong, with phenotypes including disruption of pattern and polarity of wing hairs, shortening and thickening of wing veins, progressive crumpling of the wing and appearance of dark “death” inclusions in the wing blade. In a screen designed to identify enhancers and suppressors of Drosophila p53, homozygous females carrying two copies of p53 were crossed to 5663 males carrying random insertions of a piggyBac transposon (Fraser M et al., Virology (1985) 145:356-361). Progeny containing insertions were compared to non-insertion-bearing sibling progeny for enhancement or suppression of the p53 phenotypes. Sequence information surrounding the piggyBac insertion site was used to identify the modifier genes. Modifiers of the wing phenotype were identified as members of the p53 pathway. CG1607 was an enhancer of the wing phenotype. Human orthologs of the modifiers are referred to herein as SLC7.

[0101] BLAST analysis (Altschul et al., supra) was employed to identify Targets from Drosophila modifiers. For example, representative sequences from SLC7A, GI#s 4519803, 4507053, 4507055, 14751168, 7657591, 9790235, 7657683, and 13654280 (SEQ ID NOs: 38, 40, 42, 44, 46, 47, 49, and 50, respectively) share 49%, 48%, 47%, 49%, 42%, 43%, 44%, and 57% amino acid identity, respectively, with the Drosophila CG1607.
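
For orientation, percent amino acid identity of the kind reported above is the share of aligned positions occupied by identical residues. The short Python sketch below illustrates that calculation on an already-aligned pair of sequences. The function name and the toy sequences are hypothetical, and the alignment step itself (performed by BLAST in the text) is not reproduced here.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the columns of a pairwise alignment.

    Both inputs must be the same length, with gaps written as `gap`.
    Columns in which both sequences carry a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("alignment has no scorable columns")
    # A column counts as a match only when both residues are identical
    # and neither is a gap.
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


# Toy example (hypothetical sequences, not taken from the sequence listing).
identity = percent_identity("MKT-LL", "MRTALL")
```

Here four of six aligned columns match, so the toy pair scores roughly 66.7% identity.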

[0102] Various domains, signals, and functional subunits in proteins were analyzed using the PSORT (Nakai K., and Horton P., Trends Biochem Sci, 1999, 24:34-6; Kenta Nakai, Protein sorting signals and prediction of subcellular localization, Adv. Protein Chem. 54, 277-344 (2000)), PFAM (Bateman A., et al., Nucleic Acids Res, 1999, 27:260-2; http://pfam.wustl.edu), SMART (Ponting CP, et al., SMART: identification and annotation of domains from signaling and extracellular protein sequences. Nucleic Acids Res. Jan. 1, 1999; 27(1):229-32), TM-HMM (Erik L. L. Sonnhammer, Gunnar von Heijne, and Anders Krogh: A hidden Markov model for predicting transmembrane helices in protein sequences. In Proc. of Sixth Int. Conf. on Intelligent Systems for Molecular Biology, p 175-182 Ed J. Glasgow, T. Littlejohn, F. Major, R. Lathrop, D. Sankoff, and C. Sensen Menlo Park, CA: AAAI Press, 1998), and dust (Remm M, and Sonnhammer E. Classification of transmembrane protein families in the Caenorhabditis elegans genome and identification of human orthologs. Genome Res. November 2000; 10(11):1679-89) programs. For example, using PFAM, permease domains of representative sequences from SLC7 (GI#4519803, SEQ ID NO: 38), (GI#4507053, SEQ ID NO: 40), (GI#4507055, SEQ ID NO: 42), (GI#14751168, SEQ ID NO: 44), (GI#7657591, SEQ ID NO: 46), (GI#9790235, SEQ ID NO: 47), and (GI#7657683, SEQ ID NO: 49), are located at amino acids 46-481, 41-467, 33-468, 36-474, 26-462, 36-471, and 40-475, respectively.

II. High-Throughput In Vitro Fluorescence Polarization Assay

[0103] Fluorescently-labeled SLC7 peptide/substrate is added to each well of a 96-well microtiter plate, along with a test agent in a test buffer (10 mM HEPES, 10 mM NaCl, 6 mM magnesium chloride, pH 7.6). Changes in fluorescence polarization, determined by using a Fluorolite FPM-2 Fluorescence Polarization Microtiter System (Dynatech Laboratories, Inc), relative to control values indicate that the test compound is a candidate modifier of SLC7 activity.
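
As a minimal illustration of how a change in fluorescence polarization relative to control might be scored, the following Python sketch computes polarization in millipolarization units (mP) from parallel and perpendicular emission intensities, using the standard definition P = (I_parallel − I_perpendicular)/(I_parallel + I_perpendicular). The function names, the G-factor default, and the 20 mP shift used to flag a well are illustrative assumptions; none of them is specified in the protocol above.

```python
def polarization_mp(i_parallel, i_perpendicular, g_factor=1.0):
    """Return fluorescence polarization in mP.

    g_factor is an instrument correction applied to the perpendicular
    channel; the default of 1.0 (no correction) is an assumption.
    """
    corrected = g_factor * i_perpendicular
    total = i_parallel + corrected
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return 1000.0 * (i_parallel - corrected) / total


def is_candidate_modifier(test_mp, control_mp, min_shift_mp=20.0):
    """Flag a well whose polarization differs from the control value by
    at least min_shift_mp (a hypothetical cutoff chosen for illustration)."""
    return abs(test_mp - control_mp) >= min_shift_mp


# Example with made-up intensities for a control well and a test well.
control = polarization_mp(300.0, 100.0)
test = polarization_mp(260.0, 140.0)
flagged = is_candidate_modifier(test, control)
```

In practice the cutoff would be set from the variability of replicate control wells rather than fixed in advance.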

III. High-Throughput In Vitro Binding Assay

[0104] ³³P-labeled SLC7 peptide is added in an assay buffer (100 mM KCl, 20 mM HEPES pH 7.6, 1 mM MgCl₂, 1% glycerol, 0.5% NP-40, 50 mM beta-mercaptoethanol, 1 mg/ml BSA, cocktail of protease inhibitors) along with a test agent to the wells of a Neutralite-avidin coated assay plate and incubated at 25° C. for 1 hour. Biotinylated substrate is then added to each well and incubated for 1 hour. Reactions are stopped by washing with PBS, and the wells are counted in a scintillation counter. Test agents that cause a difference in activity relative to control without test agent are identified as candidate p53 modulating agents.

IV. Immunoprecipitations and Immunoblotting

[0105] For coprecipitation of transfected proteins, 3×10⁶ appropriate recombinant cells containing the SLC7 proteins are plated on 10-cm dishes and transfected on the following day with expression constructs. The total amount of DNA is kept constant in each transfection by adding empty vector. After 24 h, cells are collected, washed once with phosphate-buffered saline and lysed for 20 min on ice in 1 ml of lysis buffer containing 50 mM Hepes, pH 7.9, 250 mM NaCl, 20 mM β-glycerophosphate, 1 mM sodium orthovanadate, 5 mM p-nitrophenyl phosphate, 2 mM dithiothreitol, protease inhibitors (complete, Roche Molecular Biochemicals), and 1% Nonidet P-40. Cellular debris is removed by centrifugation twice at 15,000×g for 15 min. The cell lysate is incubated with 25 μl of M2 beads (Sigma) for 2 h at 4° C. with gentle rocking.

[0106] After extensive washing with lysis buffer, proteins bound to the beads are solubilized by boiling in SDS sample buffer, fractionated by SDS-polyacrylamide gel electrophoresis, transferred to polyvinylidene difluoride membrane and blotted with the indicated antibodies. The reactive bands are visualized with horseradish peroxidase coupled to the appropriate secondary antibodies and the enhanced chemiluminescence (ECL) Western blotting detection system (Amersham Pharmacia Biotech).

V. Expression Analysis

[0107] All cell lines used in the following experiments are NCI (National Cancer Institute) lines, and are available from ATCC (American Type Culture Collection, Manassas, Va. 20110-2209). Normal and tumor tissues were obtained from Impath, UC Davis, Clontech, Stratagene, and Ambion.

[0108] TaqMan analysis was used to assess expression levels of the disclosed genes in various samples.

[0109] RNA was extracted from each tissue sample using Qiagen (Valencia, Calif.) RNeasy kits, following manufacturer's protocols, to a final concentration of 50 ng/μl. Single stranded cDNA was then synthesized by reverse transcribing the RNA samples using random hexamers and 500 ng of total RNA per reaction, following protocol 4304965 of Applied Biosystems (Foster City, Calif., http://www.appliedbiosystems.com/).

[0110] Primers for expression analysis using TaqMan assay (Applied Biosystems, Foster City, Calif.) were prepared according to the TaqMan protocols, and the following criteria:

[0111] a) primer pairs were designed to span introns to eliminate genomic contamination, and

[0112] b) each primer pair produced only one product.

[0113] TaqMan reactions were carried out following manufacturer's protocols, in 25 μl total volume for 96-well plates and 10 μl total volume for 384-well plates, using 300 nM primer and 250 nM probe, and approximately 25 ng of cDNA. The standard curve for result analysis was prepared using a universal pool of human cDNA samples, which is a mixture of cDNAs from a wide variety of tissues, so that there is a good chance that a target will be present in appreciable amounts. The raw data were normalized using 18S rRNA (universally expressed in all tissues and cells).
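
The standard-curve analysis and 18S rRNA normalization described above can be sketched as follows. This is one common way of applying a standard curve, assuming that the threshold cycle (Ct) is linear in log10 of the input quantity; the fitting procedure actually used is not given in the text, and the function names and dilution values below are hypothetical.

```python
import math


def fit_standard_curve(quantities, ct_values):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate the input quantity."""
    return 10 ** ((ct - intercept) / slope)


def normalized_expression(target_ct, target_curve, ref_ct, ref_curve):
    """Target quantity divided by reference (e.g. 18S rRNA) quantity."""
    target_qty = quantity_from_ct(target_ct, *target_curve)
    ref_qty = quantity_from_ct(ref_ct, *ref_curve)
    return target_qty / ref_qty


# Hypothetical dilution series of the universal cDNA pool (ng) and its Ct values.
curve = fit_standard_curve([100.0, 10.0, 1.0], [20.0, 23.32, 26.64])
estimated_ng = quantity_from_ct(23.32, *curve)
```

A separate curve would be fitted for the target gene and for 18S rRNA, and the ratio of the two interpolated quantities taken as the normalized expression value.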

[0114] For each expression analysis, tumor tissue samples were compared with matched normal tissues from the same patient. A gene was considered overexpressed in a tumor when the level of expression of the gene was 2 fold or higher in the tumor compared with its matched normal sample. In cases where normal tissue was not available, a universal pool of cDNA samples was used instead. In these cases, a gene was considered overexpressed in a tumor sample when the difference of expression levels between a tumor sample and the average of all normal samples from the same tissue type was greater than 2 times the standard deviation of all normal samples (i.e., Tumor − average(all normal samples) > 2 × STDEV(all normal samples)).
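
The two overexpression criteria stated above can be written out as a short Python sketch. The function names are hypothetical, and the use of the sample standard deviation (the behavior of a spreadsheet STDEV function and of Python's statistics.stdev) is an assumption, since the text does not say which estimator was applied.

```python
from statistics import mean, stdev


def overexpressed_vs_matched(tumor, matched_normal, fold_cutoff=2.0):
    """True when tumor expression is 2 fold or higher than the matched
    normal sample from the same patient."""
    if matched_normal <= 0:
        raise ValueError("matched normal expression must be positive")
    return tumor >= fold_cutoff * matched_normal


def overexpressed_vs_normal_pool(tumor, normals, n_sd=2.0):
    """True when Tumor - average(normals) > 2 x STDEV(normals).

    Requires at least two normal samples of the same tissue type.
    stdev() is the sample standard deviation (assumed estimator).
    """
    return tumor - mean(normals) > n_sd * stdev(normals)


# Made-up normalized expression values for illustration.
call_matched = overexpressed_vs_matched(4.0, 2.0)
call_pool = overexpressed_vs_normal_pool(4.5, [1.0, 2.0, 3.0])
```

Note that the first criterion is inclusive (exactly 2 fold counts as overexpressed) while the second is a strict inequality, mirroring the wording of the paragraph.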

[0115] Results are shown in Table 1. Data presented in bold indicate that greater than 50% of tested tumor samples of the tissue type indicated in row 1 exhibited overexpression of the gene listed in column 1, relative to normal samples. Underlined data indicate that between 25% and 49% of tested tumor samples exhibited overexpression. A modulator identified by an assay described herein can be further validated for therapeutic effect by administration to a tumor in which the gene is overexpressed. A decrease in tumor growth confirms therapeutic utility of the modulator. Prior to treating a patient with the modulator, the likelihood that the patient will respond to treatment can be diagnosed by obtaining a tumor sample from the patient, and assaying for expression of the gene targeted by the modulator. The expression data for the gene(s) can also be used as a diagnostic marker for disease progression. The assay can be performed by expression analysis as described above, by antibody directed to the gene target, or by any other available detection method.

TABLE 1

                               breast     colon      kidney     lung       ovary
GI#9790234 (SEQ ID NO:26)      1   3      7   26     5   19     6   14     1   4
GI#7657682 (SEQ ID NO:30)      1   3      16  26     14  19     12  14     2   4
GI#13649338 (SEQ ID NO:1)      3   3      21  26     8   19     10  14     1   4
GI#13647529 (SEQ ID NO:8)      1   3      9   26     1   19     1   14     0   4
GI#13111751 (SEQ ID NO:11)     0   3      11  26     4   19     0   14     1   4
GI#6642959 (SEQ ID NO:14)      1   3      8   26     1   19     3   14     0   4
GI#14763953 (SEQ ID NO:22)     1   3      6   26     3   19     3   14     3   4
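
The bold and underlined categories described for Table 1 can be expressed as a simple classification, sketched below in Python. The sketch assumes that each pair of numbers in Table 1 gives the count of tumor samples showing overexpression followed by the count of tumor samples tested; that reading is an inference from the surrounding text rather than something the table states. The function name is hypothetical, and because the text leaves the treatment of exactly 50% unspecified, grouping it with the 25% to 49% band is a further assumption.

```python
def overexpression_category(n_overexpressing, n_tested):
    """Classify a Table 1 entry by the fraction of tumors overexpressing.

    Returns "bold" for greater than 50%, "underlined" for 25% up to and
    including 50% (the 50% boundary is an assumption), else "plain".
    """
    if n_tested <= 0:
        raise ValueError("n_tested must be positive")
    if not 0 <= n_overexpressing <= n_tested:
        raise ValueError("n_overexpressing must lie between 0 and n_tested")
    fraction = n_overexpressing / n_tested
    if fraction > 0.50:
        return "bold"
    if fraction >= 0.25:
        return "underlined"
    return "plain"


# Colon entries for two rows of Table 1, read as (overexpressing, tested).
colon_seq1 = overexpression_category(21, 26)    # GI#13649338 (SEQ ID NO:1)
colon_seq26 = overexpression_category(7, 26)    # GI#9790234 (SEQ ID NO:26)
```

Under this reading, 21 of 26 colon tumors falls in the bold category and 7 of 26 falls in the underlined category.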

[0116]

1 54 1 1593 DNA Homo sapiens 1 ggccggtgcg cagagcatgg cgggtgcggg cccgaaagcg gcgcgcgcta gcggcgccgg 60 cggccgagaa aaagaaagag gcccgggaga agatgctggc cgccaagagc gcggacggct 120 cggcgccggc aggcgagggc gagggcgtga ccctgcagcg gaacatcacg ctgctcaacg 180 gcgtggccat catcgtgggg accattatcg gctcgggcat cttcgtgacg cccacgggcg 240 tgctcaagga ggcaggctcg ccggggctgg cgctggtggt gtgggccgcg tgcggcgtct 300 tctccatcgt gggcgcgctc tgctacgcgg agctcggcac caccatctcc aaatcgggcg 360 gcgactacgc ctacatgctg gaggtctacg gctcgctgcc cgccttcctc aagctctgga 420 tcgagctgct catcatccgg ccttcatcgc agtacatcgt ggccctggtc ttcgccacct 480 acctgctcaa gccgctcttc cccacctgcc cggtgcccga ggaggcagcc aagctcgtgg 540 cctgcctctg cgtgctgctg ctcacggccg tgaactgcta cagcgtgaag gccgccaccc 600 gggtccagga tgcctttgcc gccgccaagc tcctggccct ggccctgatc atcctgctgg 660 gcttcgtcca gatcgggaag ggtgatgtgt ccaatctaga tcccaacttc tcatttgaag 720 gcaccaaact ggatgtgggg aacattgtgc tggcattata cagcggcctc tttgcctatg 780 gaggatggaa ttacttgaat ttcgtcacag aggaaatgat caacccctac agaaacctgc 840 ccctggccat catcatctcc ctgcccatcg tgacgctggt gtacgtgctg accaacctgg 900 cctacttcac caccctgtcc accgagcaga tgctgtcgtc cgaggccgtg gccgtggact 960 tcgggaacta tcacctgggc gtcatgtcct ggatcatccc cgtcttcgtg ggcctgtcct 1020 gcttcggctc cgtcaatggg tccctgttca catcctccag gctcttcttc gtggggtccc 1080 gggaaggcca cctgccctcc atcctctcca tgatccaccc acagctcctc acccccgtgc 1140 cgtccctcgt gttcacgtgt gtgatgacgc tgctctacgc cttctccaag gacatcttct 1200 ccgtcatcaa cttcttcagc ttcttcaact ggctctgcgt ggccctggcc atcatcggca 1260 tgatctggct gcgccacaga aagcctgagc ttgagcggcc catcaaggtg aacctggccc 1320 tgcctgtgtt cttcatcctg gcctgcctct tcctgatcgc cgtctccttc tggaagacac 1380 ccgtggagtg tggcatcggc ttcaccatca tcctcagcgg gctgcccgtc tacttcttcg 1440 gggtctggtg gaaaaacaag cccaagtggc tcctccaggg catcttctcc acgaccgtcc 1500 tgtgtcagaa gctcatgcag gtggtccccc aggagacata gccaggaggc cgagtggctg 1560 ccggaggagc atgcgcagag gccagttaaa gta 1593 2 4539 DNA Homo sapiens 2 cggcgcgcac actgctcgct gggccgcggc tcccgggtgt cccaggcccg gccggtgcgc 60 
agagcatggc gggtgcgggc ccgaagcggc gcgcgctagc ggcgccggcg gccgaggaga 120 aggaagaggc gcgggagaag atgctggccg ccaagagcgc ggacggctcg gcgccggcag 180 gcgagggcga gggcgtgacc ctgcagcgga acatcacgct gctcaacggc gtggccatca 240 tcgtggggac cattatcggc tcgggcatct tcgtgacgcc cacgggcgtg ctcaaggagg 300 caggctcgcc ggggctggcg ctggtggtgt gggccgcgtg cggcgtcttc tccatcgtgg 360 gcgcgctctg ctacgcggag ctcggcacca ccatctccaa atcgggcggc gactacgcct 420 acatgctgga ggtctacggc tcgctgcccg ccttcctcaa gctctggatc gagctgctca 480 tcatccggcc ttcatcgcag tacatcgtgg ccctggtctt cgccacctac ctgctcaagc 540 cgctcttccc cacctgcccg gtgcccgagg aggcagccaa gctcgtggcc tgcctctgcg 600 tgctgctgct cacggccgtg aactgctaca gcgtgaaggc cgccacccgg gtccaggatg 660 cctttgccgc cgccaagctc ctggccctgg ccctgatcat cctgctgggc ttcgtccaga 720 tcgggaaggg tgatgtgtcc aatctagatc ccaacttctc atttgaaggc accaaactgg 780 atgtggggaa cattgtgctg gcattataca gcggcctctt tgcctatgga ggatggaatt 840 acttgaattt cgtcacagag gaaatgatca acccctacag aaacctgccc ctggccatca 900 tcatctccct gcccatcgtg acgctggtgt acgtgctgac caacctggcc tacttcacca 960 ccctgtccac cgagcagatg ctgtcgtccg aggccgtggc cgtggacttc gggaactatc 1020 acctgggcgt catgtcctgg atcatccccg tcttcgtggg cctgtcctgc ttcggctccg 1080 tcaatgggtc cctgttcaca tcctccaggc tcttcttcgt ggggtcccgg gaaggccacc 1140 tgccctccat cctctccatg atccacccac agctcctcac ccccgtgccg tccctcgtgt 1200 tcacgtgtgt gatgacgctg ctctacgcct tctccaagga catcttctcc gtcatcaact 1260 tcttcagctt cttcaactgg ctctgcgtgg ccctggccat catcggcatg atctggctgc 1320 gccacagaaa gcctgagctt gagcggccca tcaaggtgaa cctggccctg cctgtgttct 1380 tcatcctggc ctgcctcttc ctgatcgccg tctccttctg gaagacaccc gtggagtgtg 1440 gcatcggctt caccatcatc ctcagcgggc tgcccgtcta cttcttcggg gtctggtgga 1500 aaaacaagcc caagtggctc ctccagggca tcttctccac gaccgtcctg tgtcagaagc 1560 tcatgcaggt ggtcccccag gagacatagc caggaggccg agtggctgcc ggaggagcat 1620 gcgcagaggc cagttaaagt agatcacctc ctcgaaccca ctccggttcc ccgcaaccca 1680 cagctcagct gcccatccca gtccctcgcc gtccctccca ggtcgggcag tggaggctgc 1740 tgtgaaaact ctggtacgaa 
tctcatccct caactgaggg ccagggaccc aggtgtgcct 1800 gtgctcctgc ccaggagcag cttttggtct ccttgggccc tttttccctt ccctcctttg 1860 tttacttata tatatatttt ttttaaactt aaattttggg tcaacttgac accactaaga 1920 tgatttttta aggagctggg ggaaggcagg agccttcctt tctcctgccc caagggccca 1980 gaccctgggc aaacagagct actgagactt ggaacctcat tgctacgaca gacttgcact 2040 gaagccggac agctgcccag acacatgggc ttgtgacatt cgtgaaaacc aaccctgtgg 2100 gcttatgtct ctgccttagg gtttgcagag tggaaactca gccgtagggt ggcactggga 2160 gggggtgggg gatctgggca aggtgggtga ttcctcccag gaggtgcttg aggccccgat 2220 ggactcctga ccataatcct agccccgaga caccatcctg agccagggaa cagccccagg 2280 gttggggggt gccggcatct cccctagctc accaggcctg gcctctgggc agtgtggcct 2340 cttggctatt tctgttccag ttttggaggc tgagttctgg ttcatgcaga caaagccctg 2400 tccttcagtc ttctagaaac agagacaaga aaggcagaca caccgcggcc aggcacccat 2460 gtgggcgccc accctgggct ccacacagca gtgtcccctg ccccagaggt cgcagctacc 2520 ctcagcctcc aatgcattgg cctctgtacc gcccggcagc cccttctggc cggtgctggg 2580 ttcccactcc cggcctaggc acctccccgc tctccctgtc acgctcatgt cctgtcctgg 2640 tcctgatgcc cgttgtctag gagacagagc caagcactgc tcacgtctct gccgcctgcg 2700 tttggaggcc cctgggctct cacccagtcc ccacccgcct gcagagaggg aactagggca 2760 ccccttgttt ctgttgttcc cgtgaatttt tttcgctatg ggaggcagcc gaggcctggc 2820 caatgcggcc cactttcctg agctgtcgct gcctccatgg cagcagccaa ggacccccag 2880 aacaagaaga cccccccgca ggatccctcc tgagctcggg gggctctgcc ttctcaggcc 2940 ccgggcttcc cttctcccca gccagaggtg gagccaagtg gtccagcgtc actccagtgc 3000 tcagctgtgg ctggaggagc tggcctgtgg cacagccctg agtgtcccaa gccgggagcc 3060 aacgaagccg gacacggctt cactgaccag cggctgctca agccgcaagc tctcagcaag 3120 tgcccagtgg agcctgccgc ccccacctgg gcaccgggac cccctcacca tccagtgggc 3180 ccggagaaac ctgatgaaca gtttggggac tcaggaccag atgtccgtct ctcttgcttg 3240 aggaatgaag acctttattc acccctgccc cgttgcttcc cgctgcacat ggacagactt 3300 cacagcgtct gctcatagga cctgcatcct tcctggggac gaattccact cgtccaaggg 3360 acagcccacg gtctggaggc cgaggaccac cagcaggcag gtggactgac tgtgttgggc 3420 aagacctctt ccctctgggc ctgttctctt 
ggctgcaaat aaggacagca gctggtgccc 3480 cacctgcctg gtgcattgct gtgtgaatcc aggaggcagt ggacatcgta ggcagccacg 3540 gccccgggtc caggagaagt gctccctgga ggcacgcacc actgcttccc actggggccg 3600 gcggggccca cgcacgacgt cagcctctta ccttcccgcc tcggctaggg gtcctcggga 3660 tgccgttctg ttccaacctc ctgctctggg aggtggacat gcctcaagga tacagggagc 3720 cggcggcctc tcgacggcac gcacttgcct gttggctgct gcggctgtgg gcgagcatgg 3780 gggctgccag cgtctgttgt ggaaagtagc tgctagtgaa atggctgggg ccgctggggt 3840 ccgtcttcac actgcgcagg tctcttctgg gcgtctgagc tggggtggga gctcctccgc 3900 agaaggttgg tggggggtcc agtctgtgat ccttggtgct gtgtgcccca ctccagcctg 3960 gggaccccac ttcagaaggt aggggccgtg tcccgcggtg ctgactgagg cctgcttccc 4020 cctccccctc ctgctgtgct ggaattccac agggaccagg gccaccgcag gggactgtct 4080 cagaagactt gatttttccg tccctttttc tccacactcc actgacaaac gtccccagcg 4140 gtttccactt gtgggcttca ggtgttttca agcacaaccc accacaacaa gcaagtgcat 4200 tttcagtcgt tgtgcttttt tgttttgtgc taacgtctta ctaatttaaa gatgctgtcg 4260 gcaccatgtt tatttatttc cagtggtcat gctcagcctt gctgctctgc gtggcgcagg 4320 tgccatgcct gctccctgtc tgtgtcccag ccacgcaggg ccatccactg tgacgtcggc 4380 cgaccaggct ggacaccctc tgccgagtaa tgacgtgtgt ggctgggacc ttctttattc 4440 tgtgttaatg gctaacctgt tacactgggc tgggttgggt agggtgttct ggcttttttg 4500 tggggttttt atttttaaag aaacactcaa tcatcctag 4539 3 1559 DNA Homo sapiens 3 attggccggt aagcagagca tggcgggtgc gggcccgaag cggcgcgcgc tagcggcgcc 60 ggcggccgag gagaaggaag aggcgcggga gaagatgctg gccgccaaga gcgcggacgg 120 ctcggcgccg gcaggcgagg gcgagggcgt gaccctgcag cggaacatca cgctgctcaa 180 cggcgtggcc atcatcgtgg ggaccattat cggctcgggc atcttcgtga cgcccacggg 240 cgtgctcaag gaggcaggct cgccggggct ggcgctggtg gtgtgggccg cgtgcggcgt 300 cttctccatc gtgggcgcgc tctgctacgc ggagctcggc accaccatct ccaaatcggg 360 cggcgactac gcctacatgc tggaggtcta cggctcgctg cccgccttcc tcaagctctg 420 gatcgagctg ctcatcatcc ggccttcatc gcagtacatc gtggccctgg tcttcgccac 480 ctacctgctc aagccgctct tccccacctg cccggtgccc gaggaggcag ccaagctcgt 540 ggcctgcctc tgcgtgctgc tgctcacggc cgtgaactgc 
tacagcgtga aggccgccac 600 ccgggtccag gatgcctttg ccgccgccaa gctcctggcc ctggccctga tcatcctgct 660 gggcttcgtc cagatcggga agggtgatgt gtccaatcta gatcccaact tctcatttga 720 aggcaccaaa ctggatgtgg ggaacattgt gctggcatta tacagcggcc tctttgccta 780 tggaggatgg aattacttga atttcgtcac agaggaaatg atcaacccct acagaaacct 840 gcccctggcc atcatcatct ccctgcccat cgtgacgctg gtgtacgtgc tgaccaacct 900 ggcctacttc accaccctgt ccaccgagca gatgctgtcg tccgaggccg tggccgtgga 960 cttcgggaac tatcacctgg gcgtcatgtc ctggatcatc cccgtcttcg tgggcctgtc 1020 ctgcttcggc tccgtcaatg ggtccctgtt cacatcctcc aggctcttct tcgtggggtc 1080 ccgggaaggc cacctgccct ccatcctctc catgatccac ccacagctcc tcacccccgt 1140 gccgtccctc gtgttcacgt gtgtgatgac gctgctctac gccttctcca aggacatctt 1200 ctccgtcatc aacttcttca gcttcttcaa ctggctctgc gtggccctgg ccatcatcgg 1260 catgatctgg ctgcgccaca gaaagcctga gcttgagcgg cccatcaagg tgaacctggc 1320 cctgcctgtg ttcttcatcc tggcctgcct cttcctgatc gccgtctcct tctggaagac 1380 acccgtggag tgtggcatcg gcttcaccat catcctcagc gggctgcccg tctacttctt 1440 cggggtctgg tggaaaaaca agcccaagtg gctcctccag ggcatcttct ccacgaccgt 1500 cctgtgtcag aagctcatgc aggtggtccc ccaggagaca tagccaggag gccgagtgg 1559 4 923 DNA Homo sapiens 4 ggcacgaggc ctcgctcggc tgcggctctc gggtgtccca ggcccggccg gtaagcagag 60 catggcgggt gcgggcccga agcggcgggc gctagcggcc ccggtggccg aggagaagga 120 agaggcgcgg gagaagatgc tggcctccaa gcgcgcggac ggcgcggcgc cggcaggcga 180 gggcgagggc gtgaccctgc agcggaacat cacgctactc aacggcgtgg ccatcatcgt 240 gggcgccatc atcggctcgg gcatcttcgt gacgcccacg ggcgtgctta aggaggcagg 300 ctcgccgggg ctggcgctgg tgatgtgggc cgcgtgcggc gtcttctcca tcgtgggcgc 360 gctctgctac gcggagctgg gcaccaccat ctccaaatcg ggcggcgact acgcctacat 420 gctggacgtc tacggctcgc tgcccgcctt cctcaagctc tggatcgagc tgctcgtcat 480 ccggccttca tcgcagtaca tcgtggccct ggtcttcgcc acctacctgc tcaagccgct 540 cttccccagc tgcccggtgc ccgaggaggc agccaagctc atggcctgcc actgcgtgcg 600 tgccctggca ggagggactg gcttacgcca ccctgtgcag gatcgtcttc tcagcttaca 660 gagcctagtg tgattcaaca accctccatc tctgccccca 
acatgcccaa ctgatctggt 720 ggagggaact gaagatgaag gcctctcagg caccatggat gggatgggtg tgtgtgaagg 780 gggcagggtc cccaggagct gccccaggtc tggctggaca gttgcacggg aagtcctgtt 840 tgtgaactca acgtttccac ggcttttcca ttaaacttta ccccaaatca aaaaaaaaaa 900 aacaaaaaaa aaaaaaaaaa aaa 923 5 1609 DNA Homo sapiens 5 gctcgctggg ccgctgctcc cgggtgtccc aggcccggcc ggtgcgcaga gcatggcggg 60 tgcgggcccg aagcggcgcg cgctagcggc gccggcggcc gaggagaagg aagaggcgcg 120 ggagaagatg ctggccgcca agagcgcgga cggctcggcg ccggcaggcg agggcgaggg 180 cgtgaccctg cagcggaaca tcacgctgct caacggcgtg gccatcatcg tggggaccat 240 tatcggctcg ggcatcttcg tgacgcccac gggcgtgctc aaggaggcag gctcgccggg 300 gctggcgctg gtggtgtggg ccgcgtgcgg cgtcttctcc atcgtgggcg cgctctgcta 360 cgcggagctc ggcaccacca tctccaaatc gggcggcgac tacgcctaca tgctggaggt 420 ctacggctcg ctgcccgcct tcctcaagct ctggatcgag ctgctcatca tccggccttc 480 atcgcagtac atcgtggccc tggtcttcgc cacctacctg ctcaagccgc tcttccccac 540 ctgcccggtg cccgaggagg cagccaagct cgtggcctgc ctgtgcgtgc tgctgctcac 600 ggccgtgaac tgctacagcg tgaaggccgc cacccgggtc caggatgcct ttgccgccgc 660 caagctcctg gccctggccc tgatcatcct gctgggcttc gtccagatcg gaaagggtga 720 tgtgtccaat ctagatccca agttctcatt tgaaggcacc aaactggatg tggggaacat 780 tgtgctggca ttatacagcg gcctctttgc ctatggagga tggaattact tgaatttcgt 840 cacagaggaa atgatcaacc cctacagaaa cctgcccctg gccatcatca tctccctgcc 900 catcgtgacg ctggtgtacg tgctgaccaa cctggcctac ttcaccaccc tgtccaccga 960 gcagatgctg tcgtccgagg ccgtggccgt ggacttcggg aactatcacc tgggcgtcat 1020 gtcctggatc atccccgtct tcgtgggcct gtcctgcttt ggctccgtca atgggtccct 1080 gttcacatcc tccaggctct tcttcgtggg gtcccgggaa ggccacctgc cctccatcct 1140 ctccatgatc cacccacagc tcctcacccc cgtgccgtcc ctcgtgttca cgtgtgtgat 1200 gacgctgctc tacgccttct ccaaggacat cttctccgtc atcaacttct tcagcttctt 1260 caactggctc tgcgtggccc tggccatcat cggcatgatc tggctgcgcc acagaaagcc 1320 tgagcttgag cggcccatca aggtgaacct ggccctgcct gtgttcttca tcctggcctg 1380 cctcttcctg atcgccgtct ccttctggaa gacacccgtg gagtgtggca tcggcttcac 1440 catcatcctc 
agcgggctgc ccgtctactt cttcggggtc tggtggaaaa acaagcccaa 1500 gtggctcctc cagggcatct tctccacgac cgtcctgtgt cagaagctca tgcaggtggt 1560 cccccaggag acatagccag gaggccgagt ggctgccgga ggagcatgc 1609 6 3984 DNA Homo sapiens 6 gtcctttcac gcgtgtcttc gtgttggtgc gcttttcact ggtcataaag tgctgctcac 60 ggccgtgaac tgctacagcg tgaaggccgc cacccgggtc caggatgctt ttgccgccgc 120 caagctcctg gccctggccc tgatcatcct gctgggcttc gtccagatcg ggaagggtga 180 tgtgtccaat ctagatccca agttctcatt tgaaggcacc aaactggatg tggggaacat 240 tgtgctggca ttatacagcg gcctctttgc ctatggagga tggaattact tgaatttcgt 300 cacagaggaa atgatcaacc cctacagaaa cctgcccctg gccatcatca tctccctgcc 360 catcgtgacg ctggtgtacg tgctgaccaa cctggcctac ttcaccaccc tgtccaccga 420 gcagatgctg tcgtccgagg ccgtggccgt ggacttcggg aactatcacc tgggcgtcat 480 gtcctggatc atccccgtct tcgtgggcct gtcctgcttt ggctccgtca atgggtccct 540 gttcacatcc tccaggctct tcttcgtggg gtcccgggaa ggccacctgc cctccatcct 600 ctccatgatc cacccacagc tcctcacccc cgtgccgtcc ctcgtgttca cgtgtgtgat 660 gacgctgctc tacgccttct ccaaggacat cttctccgtc atcaacttct tcagcttctt 720 caactggctc tgcgtggccc tggccatcat cggcatgatc tggctgcgcc acagaaagcc 780 tgagcttgag cggcccatca aggtgaacct ggccctgcct gtgttcttca tcctggcctg 840 cctcttcctg atcgccgtct ccttctggaa gacacccgtg gagtgtggca tcggcttcac 900 catcatcctc agcgggctgc ccgtctactt cttcggggtc tggtggaaaa acaagcccaa 960 gtggctcctc cagggcatct tctccacgac cgtcctgtgt cagaagctca tgcaggtggt 1020 cccccaggag acatagccag gaggccgagt ggctgccgga ggagcatgcg cagaggccag 1080 ttaaagtaga tcacctcctc gaacccactc cggttccccg caacccacag ctcagctgcc 1140 catcccagtc ctcgccgtcc ctcccaggtc gggcagtgga ggctgctgtg aaaactctgg 1200 tacgaatctc atccctcaac tgagggccag ggacccaggt gtgcctgtgc tcctgcccag 1260 gagcagcttt tggtctcctt gggccctttt tcccttccct cctttgttta cttatatata 1320 tatttttttt aaacttaaat tttgggtcaa cttgacacca ctaagatgat tttttaagga 1380 gctgggggaa ggcaggagcc ttcctttctc ctgccccaag ggcccagacc ctgggcaaac 1440 agagctactg agacttggaa cctcattgct accacagact tgcactgaag ccagacagct 1500 gcccagacac atgggcttgt 
gacattcgtg aaaaccaacc ctgtgggctt atgtctctgc 1560 cttagggttt gcagagtgga aactcagccg tagggtggca ctgggagggg gtgggggatc 1620 tgggcaaggt gggtgattcc tcccaggagg tgcttgaggc cccgatggac tcctgaccat 1680 aatcctagcc ccgagacacc atcctgagcc agggaacagc cccagggttg gggggtgccg 1740 gcatctcccc tagctcacca ggcctggcct ctgggcagtg tggcctcttg gctatttctg 1800 ttccagtttt ggaggctgag ttctggttca tgcagacaaa gccctgtcct tcagtcttct 1860 agaaacagag acaagaaagg cagacacacc gcggccaggc acccatgtgg gcgcccaccc 1920 tgggctccac acagcagtgt cccctgcccc agaggtcgca gctaccctca gcctccaatg 1980 cattggcctc tgtaccgccc ggcagcccct tctggccggt gctgggttcc cactcccggc 2040 ctaggcacct ccccgctctc cctgtcacgc tcatgtcctg tcctggtcct gatgcccgtt 2100 gtctaggaga cagagccaag cactgctcac gtctctgccg cctgcgtttg gaggcccctg 2160 ggctctcacc cagtccccac ccgcctgcag agagggaact agggcacccc ttgtttctgt 2220 tgttcccgtg aatttttttc gctatgggag gcagccgagg cctggccaat gcggcccact 2280 ttcctgagct gtcgctgcct ccatggcagc agccaaggac ccccagaaca agaagacccc 2340 cccgcaggat ccctcctgag ctcggggggc tctgccttct caggccccgg gcttcccttc 2400 tccccagcca gaggtggagc caagtggtcc agcgtcactc cagtgctcag ctgtggctgg 2460 aggagctggc ctgtggcaca gccctgagtg tcccaagccg ggagccaacg aagccggaca 2520 cggcttcact gaccagcggc tgctcaagcc gcaagctctc agcaagtgcc cagtggagcc 2580 tgccgccccc acctgggcac cgggaccccc tcaccatcca gtgggcccgg agaaacctga 2640 tgaacagttt ggggactcag gaccagatgt ccgtctctct tgcttgagga atgaagacct 2700 ttattcaccc ctgccccgtt gcttcccgct gcacatggac agacttcaca gcgtctgctc 2760 ataggacctg catccttcct ggggacgaat tccactcgtc caagggacag cccacggtct 2820 ggaggccgag gaccaccagc aggcaggtgg actgactgtg ttgggcaaga cctcttccct 2880 ctgggcctgt tctcttggct gcaaataagg acagcagctg gtgccccacc tgcctggtgc 2940 attgctgtgt gaatccagga ggcagtggac atcgtaggca gccacggccc caggtccagg 3000 agaagtgctc cctggaggca cggaccactg cttcccactg gggccggcgg ggcccacgca 3060 cgacgtcagc ctcttacctt cccgcctcgg ctaggggtcc tcgggatgcc gttctgttcc 3120 aacctcctgt tctgggaggt ggacatgcct caaggataca gggagccggc ggcctctcga 3180 cggcacgcac ttcctgttgg ctgctgcggc 
tgtgggcgag catgggggct gccagcgtct 3240 gttgtggaaa gtagctgcta gtgaaatggc tggggccgct ggggtccgtc ttcacactgc 3300 gcaggtctct tctgggcgtc tgagctgggg tgggagctcc tccgcagaag gttggtgggg 3360 ggtccagtct gtgatccttg gtgctgtgtg ccccactcca gcctggggac cccacttcag 3420 aaggtagggg ccgtgtcccg cggtgctgac tgaggcctgc ttccccctcc ccctcctgct 3480 gtgctggaat tccacaggga ccagggccac cgcaggggac tgtctcagaa gacttgattt 3540 ttccgtccct ttttctccac actccactga caaacgtccc cagcggtttc cacttgtggg 3600 cttcaggtgt tttcaagcac aacccaccac aacaagcaag tgcattttca gtcgttgtgc 3660 ttttttgttt tgtgctaacg tcttactaat ttaaagatgc tgtcggcacc atgtttattt 3720 atttccagtg gtcatgctca gccttgctgc tctgcgtggc gcaggtgcca tgcctgctcc 3780 ctgtctgtgt cccagccacg cagggccatc cactgtgacg tcggccgacc aggctggaca 3840 ccctctgccg agtaatgacg tgtgtggctg ggaccttctt tattctgtgt taatggctaa 3900 cctgttacac tgggctgggt tgggtagggt gttctggctt ttttgtgggg tttttatttt 3960 taaagaaaca ctcaatcatc ctag 3984 7 1621 DNA Homo sapiens 7 cggaaggtgc ctgtgacgag gattggccgg taagcagagc atggcgggtg cgggcccgaa 60 gcggcgggcg ctagcggccc cggtggccga ggagaaggaa gaggcgcggg agaagatgct 120 ggcctccaag cgcgcggacg gcgcggcgcc ggcaggcgag ggcgagggcg tgaccctgca 180 gcggaacatc acgctactca acggcgtggc catcatcgtg ggcgccatca tcggctcggg 240 catcttcgtg acgcccacgg gcgtgcttaa ggaggcaggc tcgccggggc tggcgctggt 300 gatgtgggcc gcgtgcggcg tcttctccat cgtgggcgcg ctctgctacg cggagctcgg 360 caccaccatc tccaaatcgg gcggcgacta cgcctacatg ctggaggtct acggctcgct 420 gcccgccttc ctcaagctct ggatcgagct gctcatcatc cggccttcat cgcagtacat 480 cgtggccctg gtcttcgccg cctacctgct caagccgctc ttccccacct gcccggtgcc 540 cgaggaggca gccaagctcg tggcctgcct ctgcgtgctg ctgctcacgg ccgtgaactg 600 ctacagcgtg aaggccgcca cccgggtcca ggatgccttt gccgccgcca agctcctggc 660 cctggccctg atcatcctgc tgggcttcgt ccagatcggg aagggtgatg tgtccaatct 720 agatcccaac ttctcatttg aaggcaccaa actggatgtg gggaacattg tgctggcatt 780 atacagcggc ctctttgcct atggaggatg gaattacttg aatttcgtca cagaggaaat 840 gatcaacccc tacagaaacc tgcccctggc catcatcatc tccctgccca tcgtgacgct 900 
ggtgtacgtg ctgaccaacc tggcctactt caccaccctg tccaccgagc agatgctgtc 960 gtccgaggcc gtggccgtgg acttcgggaa ctatcacctg ggcgtcatgt cctggatcat 1020 ccccgtcttc gtgggcctgt cctgcttcgg ctccgtcaat gggtccctgt tcacatcctc 1080 caggctcttc ttcgtggggt cccgggaagg ccacctgccc tccatcctct ccatgatcca 1140 cccacagctc ctcacccccg tgccgtccct cgtgttcacg tgtgtgatga cgctgctcta 1200 cgccttctcc aaggacatct tctccgtcat caacttcttc agcttcttca actggctctg 1260 cgtggccctg gccatcatcg gcatgatctg gctgcgccac agaaagcctg agcttgagcg 1320 gcccatcaag gtgaacctgg ccctgcctgt gttcttcatc ctggcctgcc tcttcctgat 1380 cgccgtctcc ttctggaaga cacccgtgga gtgtggcatc ggcttcacca tcatcctcag 1440 cgggctgccc gtctacttct tcggggtctg gtggaaaaac aagcccaagt ggctcctcca 1500 gggcatcttc tccacgaccg tcctgtgtca gaagctcatg caggtggtcc cccaggagac 1560 atagccagga ggccgagtgg ctgccggagg agcatgcgca gaggccagtt aaagtaaggg 1620 c 1621 8 6295 DNA Homo sapiens 8 agagtttatg tggccgaggc agacaagtgg aattaggcct tgctgcaggg gacttcattt 60 ccttctcagt actggaccca tttatgagga ggtggcttat gaaagtgtga tgttcgcgta 120 tttcttgaca ggcagtggcg tgatcttggc tcactgcaac ctccgactcc ctggttcaag 180 cgattctcct gcctcagcct cctgagtggg gattacaggc cacagcaaac acaggtgtgc 240 aggaaccgtt tgtcatggaa gccagggagc ctgggaggcc cacacccacc taccatcttg 300 tccctaacac cagccagtcc caggtggaag aagatgtcag ctcgccacct caaaggtcct 360 ccgaaactat gcagctgaag aaggagatct ccctgctgaa tggggtcagc ctggtggtgg 420 gcaacatgat cggctcaggg atctttgtct cacccaaggg tgtgctggta cacactgcct 480 cctatgggat gtcactgatt gtgtgggcca ttggtgggct cttctctgtt gtgggtgccc 540 tttgttatgc agagctgggg accaccatca ccaagtcggg agccagctac gcttatattc 600 tagaggcctt tgggggcttc attgccttca tccgcctgtg ggtctcactg ctagttgttg 660 agcccaccgg tcaggccatc atcgccatca cctttgccaa ctacatcatc cagccgtcct 720 tccccagctg tgatccccca tacctggcct gccgtctcct ggctgctgct tgcatatgtc 780 tgctgacatt tgtgaactgt gcctatgtca agtggggcac acgtgtgcag gacacgttca 840 cttacgccaa ggtcgtagcg ctcattgcca tcattgtcat gggccttgtt aaactgtgcc 900 agggacactc tgagcacttt caggacgcct ttgagggttc ctcctgggac atgggaaacc 
960 tctctcttgc cctctactct gccctcttct cttactcagg ttgggacacc cttaattttg 1020 taacagaaga aatcaaaaac ccagaaagaa atttgccctt ggccattggg atttctatgc 1080 caattgtgac gctcatctac atcctgacca atgtggccta ttacacagtg ctgaacattt 1140 cagatgtcct tagcagtgat gctgtggctg tgacatttgc tgaccagacg tttggcatgt 1200 tcagctggac catccccatt gctgttgccc tgtcctgctt tgggggcctc aatgcatcca 1260 tctttgcttc atcaaggttg ttcttcgtgg gctcccggga gggccaccta ccggaccttc 1320 tgtccatgat ccacattgag cgttttacac ctatccctgc tttactgttc aattgcacca 1380 tggcactcat ctacctcatc gtggaggatg ttttccagct tatcaactac ttcagcttca 1440 gctactggtt cttcgtgggc ctgtctgttg ttggacagct ctacctccgc tggaaggagc 1500 ccaagcggcc ccggcctctc aagctgagcg tgtttttccc catcgtgttc tgcatatgct 1560 ccgtgtttct ggtgatagtg cccctcttca ctgacaccat taattccctc attggcatcg 1620 ggattgccct ttctggagtc cctttctact tcatgggtgt ttacctgcca gagtcccgga 1680 ggccattgtt tattcggaat gtcctggctg ctatcaccag aggcacccag cagctttgct 1740 tttgtgtcct gactgagctt gatgtagccg aagaaaaaaa ggatgagagg aaaactgact 1800 agaggtcaga ggtggctttc tgaggcctgg aaggcaggcc aaccagcaaa atcctgataa 1860 caagactctg tgggcccaac tctcctgaat taaaggagcc ttttgaccca atcatatagt 1920 ggggctcagg gccagtgctc actcttattg gtaagctata ggagactcag gatctgggcc 1980 aacctcaagg tgggggcttc agagggtggg gggaagattg gggaacgggg ggaatggtca 2040 tttagtttta ctcctgatag gtagatgcag ctcttacaga tatttacttg gtaaagtgca 2100 gtggggaaga gggaatgcta ggttgatagg gctggtggct tctgaatttg gtatttgaac 2160 taggagtccc tatagagggg ctgctttatg ggaagttttt ctctgaccag gtacaacacc 2220 tgactttaaa ggcctgaaat gctaccattt cttcctctgg ctcaaaattc ttccctgggg 2280 agagagttat attcccttat ttattgatat ttagtccaga acaccagttc taacgaagca 2340 tgcgtgtctc ttcatctaca ggatgcaata ggctgattgt atttaaaaat caaagtaccc 2400 aaaactgagt ccctttgggc tcagaaatgt ctgtggtatt gggtcagact ctgaccacag 2460 gttttatgct gtttagcaca atttctattg agtcttacct gcaacaatga accttaaaga 2520 tttttttact cacgtacctg ttacacttta gcatacagat agatcataga tcacgttaca 2580 agcacttggc tcaggtccag caaggacaga tgaacaaatt cctgagtcag aagtctgtta 2640 
atattgctgt tttgaaggac aatcctttat tttacttgag accttacatc tttgttctag 2700 ctgacagtaa atctctgggt ttctgttacg aactctaaga gggctgaaac ttctgatatt 2760 caggtggatc acctgaattc tctcagctgt caatggcttg gagaacatct catgggccca 2820 agtcatcaaa taacctgttc ctctctgtaa gggcagtgtg agggactgct gtgcagaccc 2880 aagcaatccc aacctggtgc taggtcattt cacttttctg aaaacctcac atcaggctgc 2940 atcctcttct gtccctggca ccaggctttg tttacacttg gagccacctt ggtgtgggtc 3000 accgggacag tgtactcctc tcctgccagc ctccccttcc ccgaggtgtg gtggctgcag 3060 tctcaggaag agcttggtac ttgtggggac ttctgttttc tccctgtgga gatcagtgaa 3120 gactgggagg aaagctgctt caacctgagt ccggctcttc agcaggctgc acaagtggaa 3180 gcaactaatt ctggtgctca ggctgggctc tccacccaag ttaggcctgc tctggcctaa 3240 tggatcttac tgtatgagca ggacggctgc attggattgt acaactgttt tgtgatgccc 3300 ccagacactg tcatcctggg ccgagaagaa cctgctagct tgacataccc catgggctta 3360 tccttaggtt ttggaattgg tcaacagtga ggcagtctcc cttcctgacc attcttctcc 3420 acccagtcac agataaggga ataaccttgg ccatatattt gctcaataaa gattgaagga 3480 agcatggtca tagttgccct gggttcagag cataatgcat atgtgaagca tggggtgaca 3540 ttcctactgt catgggtttg ggatttgtaa cggcaaattc ctgcccgacg acagggtgtc 3600 ttatgcaaag gctgacttgc ctgaacgcta agaacatgac ttctgtctga gctaagctgg 3660 cacccatccc aggggctcct ctggagctaa tcctttaagc aaaatgtgct tgccttttaa 3720 agatccctga ccccagcttt agctttctcc accagataac cagctaatcc caggaatttg 3780 ctgcccccca ccagtggctt ctagggaaag caaggacctc acatgccagg tgccctagta 3840 cttgcttagt gagccatgtc atcctccttt catttttgga tggtgacagc atttttcccc 3900 tctgtgctgg atacagactt ctcccaggat cctctctttg ggagcgaagc cagaggatcc 3960 ctacagcact caagcttcat ggtggaatta atttctgcca gctctttgtt gtctgtctcc 4020 ttaaatcctt ttcctggtgt gcttattatc ccttttgcag tgagtacagt ttattaagtt 4080 gtcagccctt taatattggg gaaacttaat gagtataaat agcagggagc acattgtaac 4140 agcacagtgt tttgtttttt tcacccggtt gctgtatgag aatggctttc aatcctttgt 4200 ttctatgcct acagacagaa agcaagatgt ctaatattag acatacaagt tgctgcctgt 4260 tataacggtg aattatacct ttgtgcatgc ctaggatgtt tgttgtttta attagctgca 4320 atatatacgg 
cctgtgtaca cagaatttaa tcacttcggc aggttgaaca actccatgta 4380 gataagagca agtgtaggca aaggtttaga aaatggacat aaagtcaaag aatgatggca 4440 ggtaggatga aggagagata cttaggaaat cctaaaagag gcggcaagaa ggtacctccc 4500 tgtgtaactc accttccccc atgacagtga gtaagagaca ctcacaggct atgagggtac 4560 acccctagct gaatgttctg tgttgtttcc ttagacctgt ggtgtccgct gcaacagcta 4620 ctagccacgt gtagctaatt acattaaaat gaaataaaat taaaagctca gtttctcagt 4680 tgcgctaatc acatttcaag tgctcagcag ccacccgtgt ctactactac acagtgcaga 4740 cacagaacat atcatcactg cagatagttc tactggacaa tgttacgcta gaataaacac 4800 caaggcagtc agttaaggca gctatggttt ggaaaggcat acggacagag tctgcttaga 4860 agagatacaa gttgttaata aaattgatcc tgttgatagt agtttgtttt tgtggtgggt 4920 gctgtgaaga gtaaacatta ctcagtggaa agctaagttc agaaggtact ttgtttttcc 4980 tcccttgcct taagtccttg gtatttataa tcaatgctga accttctatt tcactaccgc 5040 tccctgtttt agatattcag atttaaaagg ttttcaaaga attactttct tccatgttca 5100 aagctagatt ttactaaaca catgtatcac attcatatat attgtttctt ggccccactg 5160 ccaaaggaag tcagtcagta atttcacaac cgttatcaga gtttggaagc agaaatagct 5220 gttaactaaa atctcccact gctcagacta ctttctgccc taatggccat tactatccag 5280 tctgtattgc tacaagggac ccactggtac cccttttaga ttctatcaaa aggaacaggg 5340 ttttcctaga ggcaggcagc ctggtggtat ggcacagcag aagcttactg ctaatgaaat 5400 gggaacctcc ccctcccttg tggtttcagc acagaacctg aatgccagga aaaattcctg 5460 ggccaagaag ctaaagctaa agaaaccttc cttttttcaa cgtttttttt tctttcaaac 5520 tgtagggtca cttttgattg aggcaaaggg gtcctactgt aagtggaaaa gactcactcc 5580 cctaacataa gttttcactg tggtgggatg gtgccgcccg atatgcttga tatgcttttc 5640 cgttccacat gttaagctag gaaacctaac aggatgtcag cagggcagtt aactctggac 5700 tcagagccct caagggcatg tggcagaacc tcatggacat cacaagacca tcagtctgaa 5760 tccaggtcgt gggggctgtc atagccgaac tccttctgca catccagagg gtacttgctc 5820 cacatccgct gtctgctgct gcctctttcc tcctcactca ggctgttgta gtcagcagag 5880 cctagaatga catcccggga gtggattcta aatgtgattt tcctaggcta ctgcaggagc 5940 cccttctctt ctcagaaagg tctgtttttg ttcccgattg taatgcaaaa tccttgctca 6000 ataaataaaa aagaatatag 
aattcttttt tttttaaaga aggaatcact ttcctatcat 6060 ctaaaccaag ttccttcaca ctggagtatt ttgtcacttc tcccctccgt ggagtatttt 6120 gtcacttctc ccctccgtat aggatttttt gttgttgtaa gagttgtagt catattgtaa 6180 atatttttgt acctttctcc ttttaacgtg ttattgacaa acctccccaa aagaatatgc 6240 aattgtttga ttcatttctc tgttatcaga caccaataaa ttctttttgt tgggc 6295 9 6296 DNA Homo sapiens 9 cactgggaga gtttatgtgg ccgaggcaga caagtggaat taggccttgc tgcaggggac 60 ttcatttcct tctcagtact ggacccattt atgaggaggt ggcttatgaa agtgtgatgt 120 tcgcgtattt cttgacaggc agtggcgtga tcttggctca ctgcaacctc cgactccctg 180 gttcaagcga ttctcctgcc tcagcctcct gagtggggat tacaggccac agcaaacaca 240 ggtgtgcagg aaccgtttgt catggaagcc agggagcctg ggaggcccac acccacctac 300 catcttgtcc ctaacaccag ccagtcccag gtggaagaag atgtcagctc gccacctcaa 360 aggtcctccg aaactatgca gctgaagaag gagatctccc tgctgaatgg ggtcagcctg 420 gtggtgggca acatgatcgg ctcagggatc tttgtctcac ccaagggtgt gctggtacac 480 actgcctcct atgggatgtc actgattgtg tgggccattg gtgggctctt ctctgttgtg 540 ggtgcccttt gttatgcaga gctggggacc accatcacca agtcgggagc cagctacgct 600 tatattctag aggcctttgg gggcttcatt gccttcatcc gcctgtgggt ctcactgcta 660 gttgttgagc ccaccggtca ggccatcatc gccatcacct ttgccaacta catcatccag 720 ccgtccttcc ccagctgtga tcccccatac ctggcctgcc gtctcctggc tgctgcttgc 780 atatgtctgc tgacatttgt gaactgtgcc tatgtcaagt ggggcacacg tgtgcaggac 840 acgttcactt acgccaaggt cgtagcgctc attgccatca ttgtcatggg ccttgttaaa 900 ctgtgccagg gacactctga gcactttcag gacgcctttg agggttcctc ctgggacatg 960 ggaaacctct ctcttgccct ctactctgcc ctcttctctt actcaggttg ggacaccctt 1020 aattttgtaa cagaagaaat caaaaaccca gaaagaaatt tgcccttggc cattgggatt 1080 tctatgccaa ttgtgacgct catctacatc ctgaccaatg tggcctatta cacagtgctg 1140 aacatttcag atgtccttag cagtgatgct gtggctgtga catttgctga ccagacgttt 1200 ggcatgttca gctggaccat ccccattgct gttgccctgt cctgctttgg gggcctcaat 1260 gcatccatct ttgcttcatc aaggttgttc ttcgtgggct cccgggaggg ccacctaccg 1320 gaccttctgt ccatgatcca cattgagcgt tttacaccta tccctgcttt actgttcaat 1380 tgcaccatgg cactcatcta 
cctcatcgtg gaggatgttt tccagcttat caactacttc 1440 agcttcagct actggttctt cgtgggcctg tctgttgttg gacagctcta cctccgctgg 1500 aaggagccca agcggccccg gcctctcaag ctgagcgtgt ttttccccat cgtgttctgc 1560 atatgctccg tgtttctggt gatagtgccc ctcttcactg acaccattaa ttccctcatt 1620 ggcatcggga ttgccctttc tggagtccct ttctacttca tgggtgttta cctgccagag 1680 tcccggaggc cattgtttat tcggaatgtc ctggctgcta tcaccagagg cacccagcag 1740 ctttgctttt gtgtcctgac tgagcttgat gtagccgaag aaaaaaagga tgagaggaaa 1800 actgactaga ggtcagaggt ggctttctga ggcctggaag gcaggccaac cagcaaaatc 1860 ctgataacaa gactctgtgg gcccaactct cctgaattaa aggagccttt tgacccaatc 1920 atatagtggg gctcagggcc agtgctcact cttattggta agctatagga gactcaggat 1980 ctgggccaac ctcaaggtgg gggcttcaga gggtgggggg aagattgggg aacgggggga 2040 atggtcattt agttttactc ctgataggta gatgcagctc ttacagatat ttacttggta 2100 aagtgcagtg gggaagaggg aatgctaggt tgatagggct ggtggcttct gaatttggta 2160 tttgaactag gagtccctat agaggggctg ctttatggga agtttttctc tgaccaggta 2220 caacacctga ctttaaaggc ctgaaatgct accatttctt cctctggctc aaaattcttc 2280 cctggggaga gagttatatt cccttattta ttgatattta gtccagaaca ccagttctaa 2340 cgaagcatgc gtgtctcttc atctacagga tgcaataggc tgattgtatt taaaaatcaa 2400 agtacccaaa actgagtccc tttgggctca gaaatgtctg tggtattggg tcagactctg 2460 accacagatt ttatgctgtt tagcacaatt tctattgagt cttacctgca acaatgaacc 2520 ttaaagattt ttttactcac gtacctgtta cactttagca tacagataga tcatagatca 2580 cgttacaagc acttggctca ggtccagcaa ggacagatga acaaattcct gagtcagaag 2640 tctgttaata ttgctgtttt gaaggacaat cctttatttt acttgagacc ttacatcttt 2700 gttctagctg acagtaaatc tctgggtttc tgttacgaac tctaagaggg ctgaaacttc 2760 tgatattcag gtggatcacc tgaattctct cagctgtcaa tggcttggag aacatctcat 2820 gggcccaagt catcaaataa cctgttcctc tctgtaaggg cagtgtgagg gactgctgtg 2880 cagacccaag caatcccaac ctggtgctag gtcatttcac ttttctgaaa acctcacatc 2940 aggctgcatc ctcttctgtc cctggcacca ggctttgttt acacttggag ccaccttggt 3000 gtgggtcacc gggacagtgt actcctctcc tgccagcctc cccttccccg aggtgtggtg 3060 gctgcagtct caggaagagc ttggtacttg 
tggggacttc tgttttctcc ctgtggagat 3120 cagtgaagac tgggaggaaa gctgcttcaa cctgagtccg gctcttcagc aggctgcaca 3180 agtggaagca actaattctg gtgctcaggc tgggctctcc acccaagtta ggcctgctct 3240 ggcctaatgg atcttactgt atgagcagga cggctgcatt ggattgtaca actgttttgt 3300 gatgccccca gacactgtca tcctaggccg agaagaacct gctagcttga cataccccat 3360 gggcttatcc ttaggttttg gaattggtca acagtgaggc agtctccctt cctgaccatt 3420 cttctccacc cagtcacaga taagggaata accttggcca tatatttgct caataaagat 3480 tgaaggaagc atggtcatag ttgccctggg ttcagagcat aatgcatatg tgaagcatgg 3540 ggtgacattc ctactgtcat gggtttggga tttgtaacgg caaattcctg cccgacgaca 3600 gggtgtctta tgcaaaggct gacttgcctg aacgctaaga acatgacttc tgtctgagct 3660 aagctggcac ccatcccagg gctcctctgg agctaatcct ttaagcaaaa tgtgcttgcc 3720 ttttaaagat ccctgacccc agctttagct ttctccacca gataaccagc taatcccagg 3780 aatttgctgc cccccaccag tggcttctag ggaaagcaag gacctcacat gccaggtgcc 3840 ctagtacttg cttagtgagc catgtcatcc tcctttcatt tttggatggt gacagcattt 3900 ttcccctctg tgctggatac agacttctcc caggatcctc tctttgggag cgaagccaga 3960 ggatccctac agcactcaag cttcatggtg gaattaattt ctgccagctc tttgttgtct 4020 gtctccttaa atccttttcc tggtgtgctt attatccctt ttgcagtgag tacagtttat 4080 taagttgtca gccctttaat attggggaaa cttaatgagt ataaatagca gggagcacat 4140 tgtaacagca cagtgttttg tttttttcac ccggttgctg tatgagaatg gctttcaatc 4200 ctttgtttct atgcctacag acagaaagca agatgtctaa tattagacat acaagttgct 4260 gcctgttata acggtgaatt atacctttgt gcatgcctag gatgtttgtt gttttaatta 4320 gctgcaatat atacggcctg tgtacacaga atttaatcac ttcggcaggt tgaacaactc 4380 catgtagata agagcaagtg taggcaaagg tttagaaaat ggacataaag tcaaagaatg 4440 atggcaggta ggatgaagga gagatactta ggaaatccta aaagaggcgg caagaaggta 4500 cctccctgtg taactcacct tcccccatga cagtaagaga cactcacagg ctatgagggt 4560 acacccctag ctgaatgttc tgtgttgttt ccttagacct gtggtgtccg ctgcaacagc 4620 tactagccac gtgtagctaa ttacattaaa atgaaataaa attaaaagct cagtttctca 4680 gttgcgctaa tcacatttca agtgctcagc agccacccgt gtctactact acacagtgca 4740 gacacagaac atatcatcac tgcagatagt tctactggac 
aatgttacgc tagaataaac 4800 accaaggcag tcagttaagg cagctatggt ttggaaaggc atacggacag agtctgctta 4860 gaagagatac aagttgttaa taaaattgat cctgttgata gtagtttgtt tttgtggtgg 4920 gtgctgtgaa gagtaaacat tactcagtgg aaagctaagt tcagaaggta ctttgttttt 4980 cctcccttgc cttaagtcct tggtatttat aatcaatgct gaaccttcta tttcactacc 5040 gctccctgtt ttagatattc agatttaaaa ggttttcaaa gaattacttt cttccatgtt 5100 caaagctaga ttttactaaa cacatgtatc acattcatat atattgtttc ttggccccac 5160 tgccaaagga agtcagtcag taatttcaca accgttatca gagtttggaa gcagaaatag 5220 ctgttaacta aaatctccca ctgctcagac tactttctgc cctaatggcc attactatcc 5280 agtctgtatt gctacaaggg acccactggt acccctttta gattctatca aaaggaacag 5340 ggttttccta gaggcaggca gcctggtggt atggcacagc agaagcttac tgctaatgaa 5400 atgggaacct ccccctccct tgtggtttca gcacagaacc tgaatgccag gaaaaattcc 5460 tgggccaaga agctaaagct aaagaaacct tccttttttc aacgtttttt tttctttcaa 5520 actgtagggt cacttttgat tgaggcaaag gggtcctact gtaagtggaa aagactcact 5580 cccctaacat aagttttcac tgtggtggga tggtgccgcc cgatatgctt gatatgcttt 5640 tccttccaca tgttaagcta ggaaacctaa caggatgtca gcagggcagt taactctgga 5700 ctcagagccc tcaagggcat gtggcagaac ctcatggaca tcacaagacc atcagtctga 5760 atccaggtcg tgggggctgt catagccgaa ctccttctgc acatccagag ggtacttgct 5820 ccacatccgc tgtctgctgc tgcctctttc ctcctcactc aggctgttgt agtcagcaga 5880 gcctagaatg acatcccggg agtggattct aaatgtgatt ttcctaggct actgcaggag 5940 ccccttctct tctcagaaag gtctgttttt gttcccgatt gtaatgcaaa atccttgctc 6000 aataaataaa aaagaatata gaattctttt ttttttaaag aaggaatcac tttcctatca 6060 tctaaaccaa gttccttcac actggagtat tttgtcactt ctcccctccg tggagtattt 6120 tgtcacttct cccctccgta taggattttt tgttgttgta agagttgtag tcatattgta 6180 aatatttttg tacctttctc cttttaacgt gttattgaca aacctcccca aaagaatatg 6240 caattgtttg attcatttct ctgttatcag acaccaataa attctttttg ttgggc 6296 10 1581 DNA Homo sapiens 10 tgtgcaggaa ccgtttgtca tggaagccag ggagcctggg aggcccacac ccacctacca 60 tcttgtccct aacaccagcc agtcccaggt ggaagaagat gtcagctcgc cacctcaaag 120 gtcctccgaa actatgcagc tgaagaagga 
gatctccctg ctgaatgggg tcagcctggt 180 ggtgggcaac atgatcggct cagggatctt tgtctcaccc aagggtgtgc tggtacacac 240 tgcctcctat gggatgtcac tgattgtgtg ggccattggt gggctcttct ctgttgtggg 300 tgccctttgt tatgcagagc tggggaccac catcaccaag tcgggagcca gctacgctta 360 tattctagag gcctttgggg gcttcattgc cttcatccgc ctgtgggtct cactgctagt 420 tgttgagccc accggtcagg ccatcatcgc catcaccttt gccaactaca tcatccagcc 480 gtccttcccc agctgtgatc ccccatacct ggcctgccgt ctcctggctg ctgcttgcat 540 atgtctgctg acatttgtga actgtgccta tgtcaagtgg ggcacacgtg tgcaggacac 600 gttcacttac gccaaggtcg tagcgctcat tgccatcatt gtcatgggcc ttgttaaact 660 gtgccaggga cactctgagc actttcagga cgcctttgag ggttcctcct gggacatggg 720 aaacctctct cttgccctct actctgccct cttctcttac tcaggttggg acacccttaa 780 ttttgtaaca gaagaaatca aaaacccaga aagaaatttg cccttggcca ttgggatttc 840 tatgccaatt gtgacgctca tctacatcct gaccaatgtg gcctattaca cagtgctgaa 900 catttcagat gtccttagca gtgatgctgt ggctgtgaca tttgctgacc agacgtttgg 960 catgttcagc tggaccatcc ccattgctgt tgccctgtcc tgctttgggg gcctcaatgc 1020 atccatcttt gcttcatcaa ggttgttctt cgtgggctcc cgggagggcc acctaccgga 1080 ccttctgtcc atgatccaca ttgagcgttt tacacctatc cctgctttac tgttcaattg 1140 caccatggca ctcatctacc tcatcgtgga ggatgttttc cagcttatca actacttcag 1200 cttcagctac tggttcttcg tgggcctgtc tgttgttgga cagctctacc tccgctggaa 1260 ggagcccaag cggccccggc ctctcaagct gagcgtgttt ttccccatcg tgttctgcat 1320 atgctccgtg tttctggtga tagtgcccct cttcactgac accattaatt ccctcattgg 1380 catcgggatt gccctttctg gagtcccttt ctacttcatg ggtgtttacc tgccagagtc 1440 ccggaggcca ttgtttattc ggaatgtcct ggctgctatc accagaggca cccagcagct 1500 ttgcttttgt gtcctgactg agcttgatgt agccgaagaa aaaaaggatg agaggaaaac 1560 tgactagagg tcagaggtgg c 1581 11 2098 DNA Homo sapiens 11 ggcacgaggg agcatcagac cacagatcct ggaaggcact tctctccctg actgctgctc 60 acactgccgt gagaacctgc ttatatccag gaccaaggag tgagtggcaa tgccaggaag 120 ctggtgaagg gtttcctctc ctccaccatg gttgacagca ctgagtatga agtggcctcc 180 cagcctgagg tggaaacctc ccctttgggt gatggggcca gcccagggcc ggagcaggtg 240 
aagctgaaga aggagatctc actgcttaac ggcgtgtgcc tgattgtggg gaacatgatc 300 ggctcgggca tctttgtttc ccccaagggt gtgctcatat acagtgcctc ctttggtctc 360 tctctggtca tctgggctgt cgggggcctc ttctccgtct ttggggccct ttgttatgcg 420 gaactgggca ccaccattaa gaaatctggg gccagctatg cctatatcct ggaggccttt 480 ggaggattcc ttgctttcat cagactctgg acctccctgc tcatcattga gcccaccagc 540 caggccatca ttgccatcac ctttgccaac tacatggtac agcctctctt cccgagctgc 600 ttcgcccctt atgctgccag ccgcctgctg gctgctgcct gcatttgtct cttaaccttc 660 attaactgtg cctatgtcaa atggggaacc ctggtacaag atattttcac ctatgctaaa 720 gtattggcac tgatcgcggt catcgttgca ggcattgtta gacttggcca gggagcctct 780 actcattttg agaattcctt tgagggttca tcatttgcag tgggtgacat tgccctggca 840 ctgtactcag ctctgttctc ctactcaggc tgggacaccc tcaactatgt cactgaagag 900 atcaagaatc ctgagaggaa cctgcccctc tccattggca tctccatgcc cattgtcacc 960 atcatctata tcttgaccaa tgtggcctat tatactgtgc tagacatgag agacatcttg 1020 gccagtgatg ctgttgctgt gacttttgca gatcagatat ttggaatatt taactggata 1080 attccactgt cagttgcatt atcctgtttt ggtggcctca atgcctccat tgtggctgct 1140 tctaggcttt tctttgtggg ctcaagagaa ggccatctcc ctgatgccat ctgcatgatc 1200 catgttgagc ggttcacacc agtgccttct ctgctcttca atggtatcat ggcattgatc 1260 tacttgtgcg tggaagacat cttccagctc attaactact acagcttcag ctactggttc 1320 tttgtggggc tttctattgt gggtcagctt tatctgcgct ggaaggagcc tgatcgacct 1380 cgtcccctca agctcagcgt tttcttcccg attgtcttct gcctctgcac catcttcctg 1440 gtggctgttc cactttacag tgatactatc aactccctca tcggcattgc cattgccctc 1500 tcaggcctgc ccttttactt cctcatcatc agagtgccag aacataagcg accgctttac 1560 ctccgaagga tcgtggggtc tgccacaagg tacctccagg tcctgtgtat gtcagttgct 1620 gcagaaatgg atttggaaga tggaggagag atgcccaagc aacgggatcc caagtctaac 1680 taaacaccat ctggaatcct gatgtggaaa gcaggggttt ctggtctact ggctagagct 1740 aaggaagttg aaaaggaaag ctcacttctt tggaggcacc tgtccagaag cctggcctag 1800 gcagcttcaa cctttgaact tactttttga aatgaaaagt aatttatttg ttttgctaca 1860 tactgttcca gacttttaaa ggggacaatg aaggtgactg tggggaggag catgtcaggt 1920 ttgggcttgg ttgttttaga 
agcacctggg tgtgcctacc tactcctctt ttcttttaaa 1980 agggcccaca atgctccaat ttcctgtctc ctttagagag acatgaaact atcacaggtg 2040 ctggatgaca ataaaagttt atgttcctaa aaaaaaaaaa aaaaaaaaaa aaaaaaaa 2098 12 1582 DNA Homo sapiens 12 ccaccatggt tgacagcact gagtatgaag tggcctccca gcctgaggtg gaaacctccc 60 ctttgggtga tggggccagc ccagggccgg agcaggtgaa gctgaagaag gagatctcac 120 tgcttaacgg cgtgtgcctg attgtgggga acatgatcgg ctcaggcatc tttgtttccc 180 ccaagggtgt gctcatatac agtgcctcct ttggtctctc tctggtcatc tgggctgtcg 240 ggggcctctt ctccgtcttt ggggcccttt gttatgcgga actgggcacc accattaaga 300 aatctggggc cagctatgcc tatatcctgg aggcctttgg aggattcctt gctttcatca 360 gactctggac ctccctgctc atcattgagc ccaccagcca ggccatcatt gccatcacct 420 ttgccaacta catggtacag cctctcttcc cgagctgctt cgccccttat gctgccagcc 480 gcctgctggc tgctgcctgc atctgtctct taaccttcat taactgtgcc tatgtcaaat 540 ggggaaccct ggtacaagat attttcacct atgctaaagt attggcactg atcgcggtca 600 tcgttgcagg cattgttaga cttggccagg gagcctctac tcattttgag aattcctttg 660 agggttcatc atttgcagtg ggtgacattg ccctggcact gtactcagct ctgttctcct 720 actcaggctg ggacaccctc aactatgtca ctgaagagat caagaatcct gagaggaacc 780 tgcccctctc cattggcatc tccatgccca ttgtcaccat catctatatc ttgaccaatg 840 tggcctatta tactgtgcta gacatgagag acatcttggc cagtgatgct gttgctgtga 900 cttttgcaga tcagatattt ggaatattta actggataat tccactgtca gttgcattat 960 cctgttttgg tggcctcaat gcctccattg tggctgcttc taggcttttc tttgtgggct 1020 caagagaagg ccatctccct gatgccatct gcatgatcca tgttgagcgg ttcacaccag 1080 tgccttctct gctcttcaat ggtatcatgg cattgatcta cttgtgcgtg gaagacatct 1140 tccagctcat taactactac agcttcagct actggttctt tgtggggctt tctattgtgg 1200 gtcagcttta tctgcgctgg aaggagcctg atcgacctcg tcccctcaag ctcagcgttt 1260 tcttcccgat tgtcttctgc ctctgcacca tcttcctggt ggctgttcca ctttacagtg 1320 atactatcaa ctccctcatc ggcattgcca ttgccctctc aggcctgccc ttttacttcc 1380 tcatcatcag agtgccagaa cataagcgac cgctttacct ccgaaggatc gtggggtctg 1440 ccacaaggta cctccaggtc ctgtgtatgt cagttgctgc agaaatggat ttggaagatg 1500 gaggagagat gcccaagcaa 
cgggatccca agtctaacta aacaccatct ggaatcctga 1560 tgtggaaagc aggggtttct gg 1582 13 3717 DNA Mus musculus 13 tccatttaaa aacacggccc ttgtttgtta ggcagtgaca cgtgtgactc tcatttgctg 60 tcacttttag agggtaggag cgcactctgt tttcggaaaa gggcgttcta attggaaact 120 ctgtttttcc acatatctat acgtttacac gcacgagcgt ttagaaaaag actgctcttt 180 ttcgctttgg gtggtcggaa ccctacggag aaagatggaa aagggagccc gccagcgaaa 240 caacaccgcg aagaaccacc cgggttctga caccagccct gaggccgagg ctagctcggg 300 agggggcgga gtagccctga agaaagagat cggattggtc agcgcctgtg gtatcattgt 360 agggaacatc attggctccg gaatctttgt ctcaccgaaa ggtgtgctgg aaaacgcggg 420 ctctgtgggc cttgctctca ttgtctggat cgtgacaggc atcatcacag ccgtgggagc 480 tctctgctat gctgagctag gcgtcaccat ccctaagtct ggaggtgatt actcttatgt 540 gaaggacatc ttcggaggac tggctgggtt cctgcggctg tggattgctg tgctggtgat 600 ctaccccacc aaccaagctg tcatcgccct caccttctcc aactacgtgc tgcagccgct 660 cttccctacc tgcttccccc ctgagtccgg tctgcgactc ctggctgcca tctgtttgtt 720 gctcctcaca tgggtcaact gctccagtgt ccgatgggcc acccgggttc aagatatctt 780 cacagctggg aagctcctgg ccctggctct gatcatcatc atgggtattg tgcagatatg 840 caaaggagaa ttcttctggc tggagccaaa gaatgcattt gagaatttcc aagaacctga 900 catcggcctc gttgctctgg cattcctcca gggctccttt gcctatggtg gctggaactt 960 ccttaattat gtgactgagg aacttgtgga tccttacaag aaccttccta gagccatctt 1020 catctccatc ccactggtca catttgtgta cgtctttgct aatattgcat acgtcactgc 1080 aatgtccccc caggagctgc tagcctccaa tgcagttgct gtgacttttg gagagaaact 1140 cctcggggtc atggcctgga tcatgcccat ttctgttgcc ctgtccacgt ttggtggagt 1200 caatggctcc ctcttcacct cctcccggct gttctttgct ggagccagag aaggccacct 1260 tcccagtgtg ttggccatga tccacgtgaa gcgctgcact ccaatcccag ccctgctctt 1320 cacatgcctc tccaccctgc tgatgctggt caccagtgac atgtacacac tcatcaacta 1380 tgtgggcttc atcaactacc tcttctatgg ggtaacggtt gcaggacaga tagtccttcg 1440 ctggaagaag cctgacattc cccgccccat caaggtcagc ctgctgtttc ccattatcta 1500 cctgctgttc tgggccttcc tgctgatctt cagcctgtgg tcagagccag tggtgtgtgg 1560 cattggcctg gccattatgc tgacaggagt tcctgtctac ttcctgggtg 
tctactggca 1620 acacaaaccc aagtgtttca atgacttcat taagtcccta accctagtga gtcagaagat 1680 gtgtgtggtc gtgtatcccc aggaggggaa ctcgggggct gaggaaacaa ctgatgactt 1740 agaggagcaa cacaagccca tcttcaagcc tactcctgtc aaggacccgg attcggagga 1800 gcagccctga agactgccag cctgtaactg gccactcctc ccttcatcct ttctgccctg 1860 tattcctgcc taggtccccc caacacacac acacacacac acacacacac acacacacac 1920 acacacttct gtaaggcagg ggccagacct gggtgtccac agtgagacat tctaaacaaa 1980 gaccctgacc tttgtaccca aagaaccaac ctgcttccag gaccgaggcc catggtcaag 2040 gtcaatgcac cagggcctgc acacacccta aggtttggag gagagtggag gtgccattag 2100 taccctacag agcccttcct cctgggccca tgcctgttag gtgcctctaa gaaacctggg 2160 ttcactacta tttcttctcc ctaaccctga gcccaggcga aacttccatt ggggacaaaa 2220 ggtgcagccc cagcaatagc taggaagaca actcagattg tcaccgttaa gtcagctgga 2280 gagactcaga tgtggcttca ttctccaggg aaccaaaagg cagagggtac acttctcttt 2340 gccttcttct ccactctgta ctaccaagac tccactgccc agcctcaacc cactacccag 2400 agagttccct ttcaggaagc ctccacccca ctgagctaag gaccaagaag gaggactgcc 2460 ccacccccag ctctaagcca agcttgtgga atatacaggg aatgctcaga ctgcagaagc 2520 caagccctgc cccacccctg ccgctgcctc ctatcccctt gtggtgccaa agctccctga 2580 agccggaaag gcttctattt agtctaatag acatctctct cttaagggca aagtgactct 2640 gcccctccct gtccccctgt gaaagagcag tcaccctctt atcccagcag ctcttcccac 2700 ccacagctct gcctaccacc ctcacccctt tcctctcctg gaccacggct caaggaccag 2760 gtatgaagga tctccaagtc cttcaggcct gaattagcat caaaccagcc ttaagtcatc 2820 tcccatccaa gaactgggcc taaagtgctc ccagtttcca acctccagga cagacctgat 2880 attaagtcct tccctggaag gaagggtgct tcccccttcc cagcagtgct cagccctcac 2940 cctggggaga ccatgttgtt ggggggcacg ggggagtgag gagaatgttt tccgaagcct 3000 gttcttctcc cctcccccaa ctcaaacaac cctaccaccg ttctgccatc ccacctggga 3060 caagaggaca gaatggcact gactttggag accttagaaa aaccccttcc aggagagctt 3120 ctggcctgtg ctctacagga aggtgacagt gtctgccttc tcctggcagc ctccacacct 3180 gctgtgctta ggacaggatc ctttgatcac aaggcagaac cctaaccccg gtccatctac 3240 tcaaactgga atatgcaggg acctttccct ggcccctgcc agcaggtctc ccaacttcct 
3300 acctagctca ccctacccct aactggtgag aggctgagtt tgggctgaat tatatgccct 3360 taggaataca gtccaggcag gctgtggggc ctctgggggg ccagggacca cccgctaccc 3420 ctcattcctc tcattccttc agcctgtacc tgcccctcct tgaattattg tgtgtctctc 3480 tctctctttt tttttaagtg gatgccttac tttttggata actatttttg aagctggtat 3540 ttctatttct tttggatttt ttaatgtatg gcggtttggg ggcagagcta gaaccttaat 3600 aatttgtcac tgagttcatt taggttttaa attttattgg tttgttttat ggagtatttt 3660 tttttcatgt aataaaattt taaatggaaa aaaaaaaaaa aaaaaaaaaa aaaaaaa 3717 14 3728 DNA Homo sapiens 14 agcaacactt ttcttgtttg taaacgcgag tgaccagaaa gtgtgaatgc ggagtaggaa 60 tatttttcgt gttctctttt atctgcttgc cttttttaga gagtagcagt ggttcctatt 120 tcggaaaagg acgttctaat tcaaagctct ctcccaatat atttacacga atacgcattt 180 agaaagggag gcagcttttg aggttgcaat cctactgaga aggatggaag aaggagccag 240 gcaccgaaac aacaccgaaa agaaacaccc aggtgggggc gagtcggacg ccagccccga 300 ggctggttcc ggagggggcg gagtagccct gaagaaagag atcggattgg tcagtgcctg 360 tggtatcatc gtagggaaca tcatcggctc tggaatcttt gtctcgccaa agggagtgct 420 ggagaatgct ggttctgtgg gccttgctct catcgtctgg attgtgacgg gcttcatcac 480 agttgtggga gccctctgct atgctgaact cggggtcacc atccccaaat ctggaggtga 540 ctactcctat gtcaaggaca tcttcggagg actggctggg ttcctgaggc tgtggattgc 600 tgtgctggtg atctacccca ccaaccaggc tgtcatcgcc ctcaccttct ccaactacgt 660 gctgcagccg ctcttcccca cctgcttccc cccagagtct ggccttcggc tcctggctgc 720 catctgctta ttgctcctca catgggtcaa ctgttccagt gtgcggtggg ccacccgggt 780 tcaagacatc ttcacagctg ggaagctcct ggccttggcc ctgattatca tcatggggat 840 tgtacagata tgcaaaggag agtacttctg gctggagcca aagaatgcat ttgagaattt 900 ccaggaacct gacatcggcc tcgtcgcact ggctttcctt cagggctcct ttgcctatgg 960 aggctggaac tttctgaatt acgtgactga ggagcttgtt gatccctaca agaaccttcc 1020 cagagccatc ttcatctcca tcccactggt cacatttgtg tatgtctttg ccaatgtcgc 1080 ttatgtcact gcaatgtccc cccaggagct gctggcatcc aacgccgtcg ctgtgacttt 1140 tggagagaag ctcctaggag tcatggcctg gatcatgccc atttctgttg ccctgtccac 1200 atttggagga gttaatgggt ctctcttcac ctcctctcgg ctgttcttcg ctggagcccg 
1260 agagggccac cttcccagtg tgttggccat gatccacgtg aagcgctgca ccccaatccc 1320 agccctgctc ttcacatgca tctccaccct gctgatgctg gtcaccagcg acatgtacac 1380 actcatcaac tacgtgggct tcatcaacta cctcttctat ggggtcacgg ttgctggaca 1440 gatagtcctt cgctggaaga agcctgatat cccccgcccc atcaagatca acctgctgtt 1500 ccccatcatc tacttgctgt tctgggcctt cctgctggtc ttcagcctgt ggtcagagcc 1560 ggtggtgtgt ggcattggcc tggccatcat gctgacagga gtgcctgtct atttcctggg 1620 tgtttactgg caacacaagc ccaagtgttt cagtgacttc attgagctgc taaccctggt 1680 gagccagaag atgtgtgtgg tcgtgtaccc cgaggtggag cggggctcag ggacagagga 1740 ggctaatgag gacatggagg agcagcagca gcccatgtac caacccactc ccacgaagga 1800 caaggacgtg gcggggcagc cccagccctg aggaccacca ttccctggct actctctcct 1860 tcctccccct tttatcctac ctccctgcct tggtcccgcc aacacatgcg agtacacaca 1920 cacccctctc tctgcttttg tcaggcagtg gtaggacttt ggtgtgggtg gtgagaaatt 1980 gtaaacaaaa actgacattc atacccaaag aaccagcctc tcaccccagg gtccatgtcc 2040 caggccccac tccagtgctg cccacactcc cagctgctgg aggagagggg agatgccaag 2100 gtgccctgca ggacctcctc cgggccacac cctcagctgc ctcttcagga accggagctc 2160 attactgcct tccctcccag ggaggcccct tcagagagga gaggccacag gagctgcatt 2220 gtggggggac aggctcaagc aattctgtcc ccatcaaggg gtcagctgga gagacccaag 2280 accctatctg ttcaccaggg acccaaaatc caaggggatg cttccctctg ccctctttcc 2340 tgcccctccc catcatacct gcacccaccc cagccagggc tccctgtcca gaattcggtt 2400 ctcctcagga cgccaactcc cagagctaag gaccaaggag aagaacagcc tctccacccc 2460 caagccaggc ggttgaggaa catattgaga aaggttcaga ttgcagaaac ccagccctgc 2520 ccctgcctcc tgcatccagc ccccaacatg gtgccaaagc ttccagaagc caaaaagctt 2580 ctgattttta aggtagtggg catctctctt cctaatgacg aagctgctca gcaactccac 2640 ctgcccgccg caggaaggag cagtcccctg ctatccctgc agccactccc agcacacccg 2700 cacacagcca gcaccaccgc ccccaccgtg cacttctcct ctctgggcct tggcttggga 2760 ccaggtacga aggatcccca agcccttcag gcccgagatc agagccagat cagccttaag 2820 tcacctccca tccaagaact tggcctaaaa atactcccct atttctaacc ctcaggacgg 2880 atctgatatt aaatgccttc cctgggagga agggtgcttt ccccctccct agaggtgccc 2940 
attccatacc ctgggagact gaggagagca ttggctgaag cccagttcct ttcccatcca 3000 tccccaactc caataatccc ccactcctcg caggtctcag tgtcatgctg tcttggggca 3060 gggtgaaagg gtagtggcag cagggcgccc actctggaga tcctcaaaaa aggccctcct 3120 ctgtggctgg cagcctctga cctttccctg ggcttcaaag gaaggctatg gagtttgctg 3180 tgggccctgc aaccttccca gccactcctg ctgcactaag gacttaggat ccttttatca 3240 caaatcggga ttctctcccc caccccgaat tctgtctgct taaactggaa tacacaggag 3300 cccttcctgg cctggatggt gtctcccagc ttccccgccc agcttgccca ccccatagtt 3360 ggtgagatgc caagtttggt ctgagttgtg accccttcag agtagatgcc cggcaggctg 3420 gggttggccc ctggagggtc aggggaccat cttcttattc cctcttttct cattcctcca 3480 acttcctccc ctccttcaat tatttttttg taaagttgat gccttacttt ttggataaat 3540 atttttgaag ctggtatttc tatttctttt ggattttttt taatgtaagg ttgttttggg 3600 ggatggagtt agaaccttaa tgataatttc tttcgtttgg tgtaggtttt agagatttgt 3660 tttgtggaga ggtttttttc ttttgatgta ataaaattta aaatggaaaa aaaaaaaaaa 3720 aaaaaaaa 3728 15 4237 DNA Homo sapiens 15 tcccgaaacc agagggatgg ggccggctgt gcagtagaac ggggatcgaa aagaggaaaa 60 caagggcacg aagaccagcg agaaagaaga ggacacctgg gaaaggcgga agcagaagac 120 ggggaaggga aaagaaaccc atagcaggtg gaaaccagat ctagagcaac accgtcaggt 180 tcacagtttg tttttctaga agagaagaaa gtacctgagg attgctcttt tttcctaccg 240 ttaatgaaaa ctacttttgt cttcatcata aaagaaaaaa ctaaggggag gtaaaggcag 300 tctcctgttt tattaggggg agaggtgaag ggaaatccag gctcactttc tgaataagcc 360 actgcctggt gcacagagca gaaccatcct ggtttctgaa gacacatccc tttcagcaga 420 attccagccg gagtcgctgg cacagttcta tttttatatt taaatgtatg tctcccctgg 480 cctttttttt tttttttttt tttttttagc aacacttttc ttgtttgtaa acgcgagtga 540 ccagaaagtg tgaatgcgga gtaggaatat ttttcgtgtt ctcttttatc tgcttgcctt 600 ttttagagag tagcagtggt tcctatttcg gaaaaggacg ttctaattca aagctctctc 660 ccaatatatt tacacgaata cgcatttaga aagggaggca gcttttgagg ttgcaatcct 720 actgagaagg atggaagaag gagccaggca ccgaaacaac accgaaaaga aacacccagg 780 tgggggcgag tcggacgcca gccccgaggc tggttccgga gggggcggag tagccctgaa 840 gaaagagatc ggattggtca gtgcctgtgg tatcatcgta gggaacatca 
tcggctctgg 900 aatctttgtc tcgccaaagg gagtgctgga gaatgctggt tctgtgggcc ttgctctcat 960 cgtctggatt gtgacgggct tcatcacagt tgtgggagcc ctctgctatg ctgaactcgg 1020 ggtcaccatc cccaaatctg gaggtgacta ctcctatgtc aaggacatct tcggaggact 1080 ggctgggttc ctgaggctgt ggattgctgt gctggtgatc taccccacca accaggctgt 1140 catcgccctc accttctcca actacgtgct gcagccgctc ttccccacct gcttcccccc 1200 agagtctggc cttcggctcc tggctgccat ctgcttattg ctcctcacat gggtcaactg 1260 ttccagtgtg cggtgggcca cccgggttca agacatcttc acagctggga agctcctggc 1320 cttggccctg attatcatca tggggattgt acagatatgc aaaggagagt acttctggct 1380 ggagccaaag aatgcatttg aggatttcca ggaacctgac atcggcctcg tcgcactggc 1440 tttccttcag ggctcctttg cctatggagg ctggaacttt ctgaattacg tgactgagga 1500 gcttgttgat ccctacaaga accttcccag agccatcttc atctccatcc cactggtcac 1560 atttgtgtat gtctttgcca atgtcgctta tgtcactgca atgtcccccc aggagctgct 1620 ggcatccaac gccgtcgctg tgacttttgg agagaagctc ctaggagtca tggcctggat 1680 catgcccatt tctgttgccc tgtccacatt tggaggagtt aatgggtctc tcttcacctc 1740 ctctcggctg ttcttcgctg gagcccgaga gggccacctt cccagtgtgt tggccatgat 1800 ccacgtgaag cgctgcaccc caatcccagc cctgctcttc acatgcatct ccaccctgct 1860 gatgctggtc accagcgaca tgtacacact catcaactac gtgggcttca tcaactacct 1920 cttctatggg ggcacggttg ctggacagat agtccttcgc tggaagaagc ctgatatccc 1980 ccgccccatc aagatcaacc tgctgttccc catcatctac ttgctgttct gggccttcct 2040 gctggtcttc agcctgtggt cagagccggt ggtgtgtggc attggcctgg ccatcatgct 2100 gacaggagtg cctgtctatt tcctgggtgt ttactggcaa cacaagccca agtgtttcag 2160 tgacttcatt gagctgctaa ccctggtgag ccagaagatg tgtgtggtcg tgtaccccga 2220 ggtggagcgg ggctcaggga cagaggaggc taatgaggac atggaggagc agcagcagcc 2280 catgtaccaa cccactccca cgaaggacaa ggacgtggcg gggcagcccc agccctgagg 2340 accaccattc cctggctact ctctccttcc tccccctttt atcctacctc cctgccttgg 2400 tcccgccaac acatgcgagt acacacacac ccctctctct gcttttgtca ggcagtggta 2460 ggactttggt gtgggtggtg agaaattgta aacaaaaact gacattcata cccaaagaac 2520 cagcctctca ccccagggtc catgtcccag gccccactcc agtgctgccc acactcccag 
2580 ctgctggagg agaggggaga tgccaaggtg ccctgcagga cctccctccg ggccacaccc 2640 tcagctgcct cttcaggaac cggagctcat tactgccttc cctcccaggg aggccccttc 2700 agagaggaga ggccaggagc tgcattgtgg ggggacaggc tcaagcaatt ctgtccccat 2760 caaggggtca gctggagaga cccaagaccc tatctgttca ccagggaccc aaaatccaag 2820 gggatgcttc cctctgccct ctttcctgcc cctccccatc atacctgcac ccaccccagc 2880 cagggctccc tgtccagaat tcggttctcc tcaggacgcc aactcccaga gctaaggacc 2940 aaggagaaga acagcctctc cacccccaag ccaggcggtt gaggaacata ttgagaaagg 3000 ttcagattgc agaaacccag ccctgcccct gcctcctgca tccagccccc aacatggtgc 3060 caaagcttcc agaagccaaa aagcttctga tttttaaggt agtgggcatc tctctcctaa 3120 tgacgaagct gctcagcaac tccacctgcc cgccgcagga aggagcagtc ccctgctatc 3180 cctgcagcca ctcccagcac acccgcacac agccagcacc accgccccac cgtgcacttc 3240 tcctctctgg gccttggctt gggaccaggt acgaaggatc cccaagccct tcaggcctga 3300 gatcagagcc agatcagcct taagtcacct cccatccaag aacttggcct aaaaatactc 3360 ccctatttct aaccctcagg acggatctga tattaaatgc cttccctggg aggaagggtg 3420 ctttccccct ccctagaggt gcccattcca taccctggga gactgaggag agcattggct 3480 gaagcccagt tcctttccca tccatcccca actccaataa tcccccactc ctcgcaggtc 3540 tcagtgtcat gctgtcttgg ggcagggtga aagggtagtg gcagcagggc gcccactctg 3600 gagatcctca aaaaaggccc tcctctgtgg ctggcagcct ctgacctttc cctgggcttc 3660 aaaggaaggc tatggagttt gctgtgggcc ctgcaacctt cccagccact cctgctgcac 3720 taaggactta ggatcctttt atcacaaatc gggattctct cccccacccc gaattctgtc 3780 tgcttaaact ggaatacaca ggagcccttc ctggcctgga tggtgtctcc cagcttcccc 3840 gcccagcttg cccaccccat agttggtgag atgccaagtt tggtctgagt tgtgacccct 3900 tcagagtaga tgcccggcag gctggggttg gcccctggag ggtcagggga ccatcttctt 3960 attccctctt ttctcattcc tccaacttcc tcccctcctt caattatttt tttgtaaagt 4020 tgatgcctta ctttttggat aaatattttt gaagctggta tttctatttc ttttggattt 4080 tttttaatgt aaggttgttt tgggggatgg agttagaacc ttaatgataa tttctttcgt 4140 ttggtgtagg ttttagagat ttgttttgtg gagaggtttt tttcttttga tgtaataaaa 4200 tttaaaatgg aaatgaaaaa aaaaaaaaaa aaaaaaa 4237 16 1657 DNA Homo sapiens 16 
gcttttgagg ttgcaatcct actgagaagg atggaagaag gagccaggca ccgaaacaac 60 accgaaaaga aacacccagg tgggggcgag tcggacgcca gccccgaggc tggttccgga 120 gggggcggag tagccctgaa gaaagagatc ggattggtca gtgcctgtgg tatcatcgta 180 gggaacatca tcggctctgg aatctttgtc tcgccaaagg gagtgctgga gaatgctggt 240 tctgtgggcc ttgctctcat cgtctggatt gtgacgggct tcatcacagt tgtgggagcc 300 ctctgctatg ctgaactcgg ggtcaccatc cccaaatctg gaggtgacta ctcctatgtc 360 aaggacatct tcggaggact ggctgggttc ctgaggctgt ggattgctgt gctggtgatc 420 taccccacca accaggctgt catcgccctc accttctcca actacgtgct gcagccgctc 480 ttccccacct gcttcccccc agagtctggc cttcggctcc tggctgccat ctgcttattg 540 ctcctcacat gggtcaactg ttccagtgtg cggtgggcca cccgggttca agacatcttc 600 acagctggga agctcctggc cttggccctg attatcatca tggggattgt acagatatgc 660 aaaggagagt acttctggct ggagccaaag aatgcatttg agaatttcca ggaacctgac 720 atcggcctcg tcgcactggc tttccttcag ggctcctttg cctatggagg ctggaacttt 780 ctgaattacg tgactgagga gcttgttgat ccctacaaga accttcccag agccatcttc 840 atctccatcc cactggtcac atttgtgtat gtctttgcca atgtcgctta tgtcactgca 900 atgtcccccc aggagctgct ggcatccaac gccgtcgctg tgacttttgg agagaagctc 960 ctaggagtca tggcctggat catgcccatt tctgttgccc tgtccacatt tggaggagtt 1020 aatgggtctc tcttcacctc ctctcggctg ttcttcgctg gagcccgaga gggccacctt 1080 cccagtgtgt tggccatgat ccacgtgaag cgctgcaccc caatcccagc cctgctcttc 1140 acatgcatct ccaccctgct gatgctggtc accagcgaca tgtacacact catcaactat 1200 gtgggcttca tcaactacct cttctatggg gtcacggttg ctggacagat agtccttcgc 1260 tggaagaagc ctgatatccc ccgccccatc aagatcaacc tgctgttccc catcatctac 1320 ttgctgttct gggccttcct gctggtcttc agcctgtggt cagagccggt ggtgtgtggc 1380 attggcctgg ccatcatgct gacaggagtg cctgtctatt tcctgggtgt ttactggcaa 1440 cacaagccca agtgtttcag tgacttcatt gagctgctaa ccctggtgag ccagaagatg 1500 tgtgtggtcg tgtaccccga ggtggagcgg ggctcaggga cagaggaggc taatgaggac 1560 atggaggagc agcagcagcc catgtaccaa cccactccca cgaaggacaa ggacgtggcg 1620 gggcagcccc agccctgagg accaccattc cctggct 1657 17 2869 DNA Homo sapiens 17 ttaattccgt tttcgggcca 
agaattcggc acgaggattt ccaggaacct gacatcggcc 60 tcgtcgcact ggctttcctt cagggctcct ttgcctatgg aggctggaac tttctgaatt 120 acgtgactga ggagcttgtt gatccctaca agaaccttcc cagagccatc ttcatctcca 180 tcccactggt cacatttgtg tatgtctttg ccaatgtcgc ttatgtcact gcaatgtccc 240 cccaggagct gctggcatcc aacgccgtcg ctgtgacttt tggagagaag ctcctaggag 300 tcatggcctg gatcatgccc atttctgttg ccctgtccac atttggagga gttaatgggt 360 ctctcttcac ctcctctcgg ctgttcttcg ctggagcccg agagggccac cttcccagtg 420 tgttggccat gatccacgtg aagcgctgca ccccaatccc agccctgctc ttcacatgca 480 tctccaccct gctgatgctg gtcaccagcg acatgtacac actcatcaac tacgtgggct 540 tcatcaacta cctcttctat gggggcacgg ttgctggaca gatagtcctt cgctggaaga 600 agcctgatat cccccgcccc atcaagatca acctgctgtt ccccatcatc tacttgctgt 660 tctgggcctt cctgctggtc ttcagcctgt ggtcagagcc ggtggtgtgt ggcattggcc 720 tggccatcat gctgacagga gtgcctgtct atttcctggg tgtttactgg caacacaagc 780 ccaagtgttt cagtgacttc attgagctgc taaccctggt gagccagaag atgtgtgtgg 840 tcgtgtaccc cgaggtggag cggggctcag ggacagagga ggctaatgag gacatggagg 900 agcagcagca gcccatgtac caacccactc ccacgaagga caaggacgtg gcggggcagc 960 cccagccctg aggaccacca ttccctggct actctctcct tcctccccct tttatcctac 1020 ctccctgcct tggtcccgcc aacacatgcg agtacacaca cacccctctc tctgcttttg 1080 tcaggcagtg gtaggacttt ggtgtgggtg gtgagaaatt gtaaacaaaa actgacattc 1140 atacccaaag aaccagcctc tcaccccagg gtccatgtcc caggccccac tccagtgctg 1200 cccacactcc cagctgctgg aggagagggg agatgccaag gtgccctgca ggacctccct 1260 ccgggccaca ccctcagctg cctcttcagg aaccggagct cattactgcc ttccctccca 1320 gggaggcccc ttcagagagg agaggccagg agctgcattg tggggggaca ggctcaagca 1380 attctgtccc catcaagggg tcagctggag agacccaaga ccctatctgt tcaccaggga 1440 cccaaaatcc aaggggatgc ttccctctgc cctctttcct gcccctcccc atcatacctg 1500 cacccacccc agccagggct ccctgtccag aattcggttc tcctcaggac gccaactccc 1560 agagctaagg accaaggaga agaacagcct ctccaccccc aagccaggcg gttgaggaac 1620 atattgagaa aggttcagat tgcagaaacc cagccctgcc cctgcctcct gcatccagcc 1680 cccaacatgg tgccaaagct tccagaagcc aaaaagcttc 
tgatttttaa ggtagtgggc 1740 atctctctcc taatgacgaa gctgctcagc aactccacct gcccgccgca ggaaggagca 1800 gtcccctgct atccctgcag ccactcccag cacacccgca cacagccagc accaccgccc 1860 caccgtgcac ttctcctctc tgggccttgg cttgggacca ggtacgaagg atccccaagc 1920 ccttcaggcc tgagatcaga gccagatcag ccttaagtca cctcccatcc aagaacttgg 1980 cctaaaaata ctcccctatt tctaaccctc aggacggatc tgatattaaa tgccttccct 2040 gggaggaagg gtgctttccc cctccctaga ggtgcccatt ccataccctg ggagactgag 2100 gagagcattg gctgaagccc agttcctttc ccatccatcc ccaactccaa taatccccca 2160 ctcctcgcag gtctcagtgt catgctgtct tggggcaggg tgaaagggta gtggcagcag 2220 ggcgcccact ctggagatcc tcaaaaaagg ccctcctctg tggctggcag cctctgacct 2280 ttccctgggc ttcaaaggaa ggctatggag tttgctgtgg gccctgcaac cttcccagcc 2340 actcctgctg cactaaggac ttaggatcct tttatcacaa atcgggattc tctcccccac 2400 cccgaattct gtctgcttaa actggaatac acaggagccc ttcctggcct ggatggtgtc 2460 tcccagcttc cccgcccagc ttgcccaccc catagttggt gagatgccaa gtttggtctg 2520 agttgtgacc ccttcagagt agatgcccgg caggctgggg ttggcccctg gagggtcagg 2580 ggaccatctt cttattccct cttttctcat tcctccaact tcctcccctc cttcaattat 2640 ttttttgtaa agttgatgcc ttactttttg gataaatatt tttgaagctg gtatttctat 2700 ttcttttgga ttttttttaa tgtaaggttg ttttggggga tggagttaga accttaatga 2760 taatttcttt cgtttggtgt aggttttaga gatttgtttt gtggagaggt ttttttcttt 2820 tgatgtaata aaatttaaaa tggaaatgaa aaaaaaaaaa aaaaaaaaa 2869 18 2568 DNA Homo sapiens 18 atggcctgga tcatgcccat ttctgttgcc ctgtccacat ttggaggagt taatgggtct 60 ctcttcacct cctctcggct gttcttcgct ggagcccgag agggccacct tcccagtgtg 120 ttggccatga tccacgtgaa gcgctgcacc ccaatcccag ccctgctctt cacatgcatc 180 tccaccctgc tgatgctggt caccagcgac atgtacacac tcatcaacta cgtgggcttc 240 atcaactacc tcttctatgg gggcacggtt gctggacaga tagtccttcg ctggaagaag 300 cctgatatcc cccgccccat caagatcaac ctgctgttcc ccatcatcta cttgctgttc 360 tgggccttcc tgctggtctt cagcctgtgg tcagagccgg tggtgtgtgg cattggcctg 420 gccatcatgc tgacaggagt gcctgtctat ttcctgggtg tttactggca acacaagccc 480 aagtgtttca gtgacttcat tgagctgcta accctggtga 
gccagaagat gtgtgtggtc 540 gtgtaccccg aggtggagcg gggctcaggg acagaggagg ctaatgagga catggaggag 600 cagcagcagc ccatgtacca acccactccc acgaaggaca aggacgtggc ggggcagccc 660 cagccctgag gaccaccatt ccctggctac tctctccttc ctcccccttt tatcctacct 720 ccctgccttg gtcccgccaa cacatgcgag tacacacaca cccctctctc tgcttttgtc 780 aggcagtggt aggactttgg tgtgggtggt gagaaattgt aaacaaaaac tgacattcat 840 acccaaagaa ccagcctctc accccagggt ccatgtccca ggccccactc cagtgctgcc 900 cacactccca gctgctggag gagaggggag atgccaaggt gccctgcagg acctccctcc 960 gggccacacc ctcagctgcc tcttcaggaa ccggagctca ttactgcctt ccctcccagg 1020 gaggcccctt cagagaggag aggccaggag ctgcattgtg gggggacagg ctcaagcaat 1080 tctgtcccca tcaaggggtc agctggagag acccaagacc ctatctgttc accagggacc 1140 caaaatccaa ggggatgctt ccctctgccc tctttcctgc ccctccccat catacctgca 1200 cccaccccag ccagggctcc ctgtccagaa ttcggttctc ctcaggacgc caactcccag 1260 agctaaggac caaggagaag aacagcctct ccacccccaa gccaggcggt tgaggaacat 1320 attgagaaag gttcagattg cagaaaccca gccctgcccc tgcctcctgc atccagcccc 1380 caacatggtg ccaaagcttc cagaagccaa aaagcttctg atttttaagg tagtgggcat 1440 ctctctccta atgacgaagc tgctcagcaa ctccacctgc ccgccgcagg aaggagcagt 1500 cccctgctat ccctgcagcc actcccagca cacccgcaca cagccagcac caccgcccca 1560 ccgtgcactt ctcctctctg ggccttggct tgggaccagg tacgaaggat ccccaagccc 1620 ttcaggcctg agatcagagc cagatcagcc ttaagtcacc tcccatccaa gaacttggcc 1680 taaaaatact cccctatttc taaccctcag gacggatctg atattaaatg ccttccctgg 1740 gaggaagggt gctttccccc tccctagagg tgcccattcc ataccctggg agactgagga 1800 gagcattggc tgaagcccag ttcctttccc atccatcccc aactccaata atcccccact 1860 cctcgcaggt ctcagtgtca tgctgtcttg gggcagggtg aaagggtagt ggcagcaggg 1920 cgcccactct ggagatcctc aaaaaaggcc ctcctctgtg gctggcagcc tctgaccttt 1980 ccctgggctt caaaggaagg ctatggagtt tgctgtgggc cctgcaacct tcccagccac 2040 tcctgctgca ctaaggactt aggatccttt tatcacaaat cgggattctc tcccccaccc 2100 cgaattctgt ctgcttaaac tggaatacac aggagccctt cctggcctgg atggtgtctc 2160 ccagcttccc cgcccagctt gcccacccca tagttggtga gatgccaagt 
ttggtctgag 2220 ttgtgacccc ttcagagtag atgcccggca ggctggggtt ggcccctgga gggtcagggg 2280 accatcttct tattccctct tttctcattc ctccaacttc ctcccctcct tcaattattt 2340 ttttgtaaag ttgatgcctt actttttgga taaatatttt tgaagctggt atttctattt 2400 cttttggatt ttttttaatg taaggttgtt ttgggggatg gagttagaac cttaatgata 2460 atttctttcg tttggtgtag gttttagaga tttgttttgt ggagaggttt ttttcttttg 2520 atgtaataaa atttaaaatg gaaatgaaaa aaaaaaaaaa aaaaaaaa 2568 19 2839 DNA Homo sapiens 19 atttccagga acctgacatc ggcctcgtcg cactggcttt ccttcagggc tcctttgcct 60 atggaggctg gaactttctg aattacgtga ctgaggagct tgttgatccc tacaagaacc 120 ttcccagagc catcttcatc tccatcccac tggtcacatt tgtgtatgtc tttgccaatg 180 tcgcttatgt cactgcaatg tccccccagg agctgctggc atccaacgcc gtcgctgtga 240 cttttggaga gaagctccta ggagtcatgg cctggatcat gcccatttct gttgccctgt 300 ccacatttgg aggagttaat gggtctctct tcacctcctc tcggctgttc ttcgctggag 360 cccgagaggg ccaccttccc agtgtgttgg ccatgatcca cgtgaagcgc tgcaccccaa 420 tcccagccct gctcttcaca tgcatctcca ccctgctgat gctggtcacc agcgacatgt 480 acacactcat caactacgtg ggcttcatca actacctctt ctatggggtc acggttgctg 540 gacagatagt ccttcgctgg aagaagcctg atatcccccg ccccatcaag atcaacctgc 600 tgttccccat catctacttg ctgttctggg ccttcctgct ggtcttcagc ctgtggtcag 660 agccggtggt gtgtggcatt ggcctggcca tcatgctgac aggagtgcct gtctatttcc 720 tgggtgttta ctggcaacac aagcccaagt gtttcagtga cttcattgag ctgctaaccc 780 tggtgagcca gaagatgtgt gtggtcgtgt accccgaggt ggagcggggc tcaaggacag 840 aggaggctaa tgaggacatg gaggagcagc agcagcccat gtaccaaccc actcccacga 900 aggacaagga cgtggcgggg cagccccagc cctgaggacc accattccct ggctactctc 960 tccttcctcc cccttttatc ctacctccct gccttggtcc cgccaacaca tgcgagtaca 1020 cacacacccc tctctctgct tttgtcaggc agtggtagga ctttggtgtg ggtggtgaga 1080 aattgtaaac aaaaactgac attcataccc aaagaaccag cctctcaccc cagggtccat 1140 gtcccaggcc ccactccagt gctgcccaca ctcccagctg ctggaggaga ggggagatgc 1200 caaggtgccc tgcaggacct ccctccgggc cacaccctca gctgcctctt caggaaccgg 1260 agctcattac tgccttccct cccagggagg ccccttcaga gaggagaggc cacaggagct 
1320 gcattgtggg gggacaggct caagcaattc tgtccccatc aaggggtcag ctggagagac 1380 ccaagaccct atctgttcac cagggaccca aaatccaagg ggatgcttcc ctctgccctc 1440 tttcctgccc ctccccatca tacctgcacc caccccagcc agggctccct gtccagaatt 1500 cggttctcct caggacgcca actcccagag ctaaggacca aggagaagaa cagcctctcc 1560 acccccaagc caggcggttg aggaacatat tgagaaaggt tcagattgca gaaacccagc 1620 cctgcccctg cctcctgcat ccagccccca acatggtgcc aaagcttcca gaagccaaaa 1680 agcttctgat ttttaaggta gtgggcatct ctcttcctaa tgacgaagct gctcagcaac 1740 tccacctgcc cgccgcagga aggagcagtc ccctgctatc cctgcagcca ctcccagcac 1800 acccgcacac agccagcacc accgccccca ccgtgcactt ctcctctctg ggccttggct 1860 tgggaccagg tacgaaggat ccccaagccc ttcaggcccg agatcagagc cagatcagcc 1920 ttaagtcacc tcccatccaa gaacttggcc taaaaatact cccctatttc taaccctcag 1980 gacggatctg atattaaatg ccttccctgg gaggaagggt gctttccccc tccctagagg 2040 tgcccattcc ataccctggg agactgagga gagcattggc tgaagcccag ttcctttccc 2100 atccatcccc aactccaata atcccccact cctcgcaggt ctcagtgtca tgctgtcttg 2160 gggcagggtg aaagggtagt ggcagcaggg cgcccactct ggagatcctc aaaaaaggcc 2220 ctcctctgtg gctggcagcc tctgaccttt ccctgggctt caaaggaagg ctatggagtt 2280 tgctgtgggc cctgcaacct tcccagccac tcctgctgca ctaaggactt aggatccttt 2340 tatcacaaat cgggattctc tcccccaccc cgaattctgt ctgcttaaac tggaatacac 2400 aggagccctt cctggcctgg atggtgtctc ccagcttccc cgcccagctt gcccacccca 2460 tagttggtga gatgccaagt ttggtctgag ttgtgacccc ttcagagtag atgcccggca 2520 ggctggggtt ggcccctgga gggtcagggg accatcttct tattccctct tttctcattc 2580 ctccaacttc ctcccctcct tcaattattt ttttgtaaag ttgatgcctt actttttgga 2640 taaatatttt tgaagctggt atttctattt cttttggatt ttttttaatg taaggttgtt 2700 ttgggggatg gagttagaac cttaatgata atttctttcg tttggtgtag gttttagaga 2760 tttgttttgt ggagaggttt ttttcttttg atgtaataaa atttaaaatg gaaatgaaaa 2820 aaaaaaaaaa aaaaaaaaa 2839 20 4237 DNA Homo sapiens 20 tcccgaaacc agagggatgg ggccggctgt gcagtagaac ggggatcgaa aagaggaaaa 60 caagggcacg aagaccagcg agaaagaaga ggacacctgg gaaaggcgga agcagaagac 120 ggggaaggga aaagaaaccc 
atagcaggtg gaaaccagat ctagagcaac accgtcaggt 180 tcacagtttg tttttctaga agagaagaaa gtacctgagg attgctcttt tttcctaccg 240 ttaatgaaaa ctacttttgt cttcatcata aaagaaaaaa ctaaggggag gtaaaggcag 300 tctcctgttt tattaggggg agaggtgaag ggaaatccag gctcactttc tgaataagcc 360 actgcctggt gcacagagca gaaccatcct ggtttctgaa gacacatccc tttcagcaga 420 attccagccg gagtcgctgg cacagttcta tttttatatt taaatgtatg tctcccctgg 480 cctttttttt tttttttttt tttttttagc aacacttttc ttgtttgtaa acgcgagtga 540 ccagaaagtg tgaatgcgga gtaggaatat ttttcgtgtt ctcttttatc tgcttgcctt 600 ttttagagag tagcagtggt tcctatttcg gaaaaggacg ttctaattca aagctctctc 660 ccaatatatt tacacgaata cgcatttaga aagggaggca gcttttgagg ttgcaatcct 720 actgagaagg atggaagaag gagccaggca ccgaaacaac accgaaaaga aacacccagg 780 tgggggcgag tcggacgcca gccccgaggc tggttccgga gggggcggag tagccctgaa 840 gaaagagatc ggattggtca gtgcctgtgg tatcatcgta gggaacatca tcggctctgg 900 aatctttgtc tcgccaaagg gagtgctgga gaatgctggt tctgtgggcc ttgctctcat 960 cgtctggatt gtgacgggct tcatcacagt tgtgggagcc ctctgctatg ctgaactcgg 1020 ggtcaccatc cccaaatctg gaggtgacta ctcctatgtc aaggacatct tcggaggact 1080 ggctgggttc ctgaggctgt ggattgctgt gctggtgatc taccccacca accaggctgt 1140 catcgccctc accttctcca actacgtgct gcagccgctc ttccccacct gcttcccccc 1200 agagtctggc cttcggctcc tggctgccat ctgcttattg ctcctcacat gggtcaactg 1260 ttccagtgtg cggtgggcca cccgggttca agacatcttc acagctggga agctcctggc 1320 cttggccctg attatcatca tggggattgt acagatatgc aaaggagagt acttctggct 1380 ggagccaaag aatgcatttg aggatttcca ggaacctgac atcggcctcg tcgcactggc 1440 tttccttcag ggctcctttg cctatggagg ctggaacttt ctgaattacg tgactgagga 1500 gcttgttgat ccctacaaga accttcccag agccatcttc atctccatcc cactggtcac 1560 atttgtgtat gtctttgcca atgtcgctta tgtcactgca atgtcccccc aggagctgct 1620 ggcatccaac gccgtcgctg tgacttttgg agagaagctc ctaggagtca tggcctggat 1680 catgcccatt tctgttgccc tgtccacatt tggaggagtt aatgggtctc tcttcacctc 1740 ctctcggctg ttcttcgctg gagcccgaga gggccacctt cccagtgtgt tggccatgat 1800 ccacgtgaag cgctgcaccc caatcccagc cctgctcttc 
acatgcatct ccaccctgct 1860 gatgctggtc accagcgaca tgtacacact catcaactac gtgggcttca tcaactacct 1920 cttctatggg ggcacggttg ctggacagat agtccttcgc tggaagaagc ctgatatccc 1980 ccgccccatc aagatcaacc tgctgttccc catcatctac ttgctgttct gggccttcct 2040 gctggtcttc agcctgtggt cagagccggt ggtgtgtggc attggcctgg ccatcatgct 2100 gacaggagtg cctgtctatt tcctgggtgt ttactggcaa cacaagccca agtgtttcag 2160 tgacttcatt gagctgctaa ccctggtgag ccagaagatg tgtgtggtcg tgtaccccga 2220 ggtggagcgg ggctcaggga cagaggaggc taatgaggac atggaggagc agcagcagcc 2280 catgtaccaa cccactccca cgaaggacaa ggacgtggcg gggcagcccc agccctgagg 2340 accaccattc cctggctact ctctccttcc tccccctttt atcctacctc cctgccttgg 2400 tcccgccaac acatgcgagt acacacacac ccctctctct gcttttgtca ggcagtggta 2460 ggactttggt gtgggtggtg agaaattgta aacaaaaact gacattcata cccaaagaac 2520 cagcctctca ccccagggtc catgtcccag gccccactcc agtgctgccc acactcccag 2580 ctgctggagg agaggggaga tgccaaggtg ccctgcagga cctccctccg ggccacaccc 2640 tcagctgcct cttcaggaac cggagctcat tactgccttc cctcccaggg aggccccttc 2700 agagaggaga ggccaggagc tgcattgtgg ggggacaggc tcaagcaatt ctgtccccat 2760 caaggggtca gctggagaga cccaagaccc tatctgttca ccagggaccc aaaatccaag 2820 gggatgcttc cctctgccct ctttcctgcc cctccccatc atacctgcac ccaccccagc 2880 cagggctccc tgtccagaat tcggttctcc tcaggacgcc aactcccaga gctaaggacc 2940 aaggagaaga acagcctctc cacccccaag ccaggcggtt gaggaacata ttgagaaagg 3000 ttcagattgc agaaacccag ccctgcccct gcctcctgca tccagccccc aacatggtgc 3060 caaagcttcc agaagccaaa aagcttctga tttttaaggt agtgggcatc tctctcctaa 3120 tgacgaagct gctcagcaac tccacctgcc cgccgcagga aggagcagtc ccctgctatc 3180 cctgcagcca ctcccagcac acccgcacac agccagcacc accgccccac cgtgcacttc 3240 tcctctctgg gccttggctt gggaccaggt acgaaggatc cccaagccct tcaggcctga 3300 gatcagagcc agatcagcct taagtcacct cccatccaag aacttggcct aaaaatactc 3360 ccctatttct aaccctcagg acggatctga tattaaatgc cttccctggg aggaagggtg 3420 ctttccccct ccctagaggt gcccattcca taccctggga gactgaggag agcattggct 3480 gaagcccagt tcctttccca tccatcccca actccaataa tcccccactc 
ctcgcaggtc 3540 tcagtgtcat gctgtcttgg ggcagggtga aagggtagtg gcagcagggc gcccactctg 3600 gagatcctca aaaaaggccc tcctctgtgg ctggcagcct ctgacctttc cctgggcttc 3660 aaaggaaggc tatggagttt gctgtgggcc ctgcaacctt cccagccact cctgctgcac 3720 taaggactta ggatcctttt atcacaaatc gggattctct cccccacccc gaattctgtc 3780 tgcttaaact ggaatacaca ggagcccttc ctggcctgga tggtgtctcc cagcttcccc 3840 gcccagcttg cccaccccat agttggtgag atgccaagtt tggtctgagt tgtgacccct 3900 tcagagtaga tgcccggcag gctggggttg gcccctggag ggtcagggga ccatcttctt 3960 attccctctt ttctcattcc tccaacttcc tcccctcctt caattatttt tttgtaaagt 4020 tgatgcctta ctttttggat aaatattttt gaagctggta tttctatttc ttttggattt 4080 tttttaatgt aaggttgttt tgggggatgg agttagaacc ttaatgataa tttctttcgt 4140 ttggtgtagg ttttagagat ttgttttgtg gagaggtttt tttcttttga tgtaataaaa 4200 tttaaaatgg aaatgaaaaa aaaaaaaaaa aaaaaaa 4237 21 1651 DNA Homo sapiens 21 tgctgaacca agatttagct gtgcgccctc cttgcagtct cctggaacca gcaggaggaa 60 acatggggga tactggcctg agaaagcgga gagaggatga gaagtcgatc cagagccaag 120 agcctaagac caccagtctc caaaaggagc tgggcctcat cagtggcatc tccatcatcg 180 tgggcaccat cattggctct gggatcttcg tttcctccaa gtctgtgctc agcaacacgg 240 aagctgtggg gccctgcctc atcatatggg cggcttgcgg ggtcctcgcg acgctgggtg 300 ccctgtgctt tgcggagctt ggcacaatga tcaccaagtc agggggagag tatccctacc 360 tgatggaggc ctacgggccc atccccgcct acctcttctc ctgggccagc ctgatcgtca 420 ttaagcccac gtccttcgcc atcatctgcc tcagcttctc cgagtatgtg tgtgcgccct 480 tctatgtggg ctgcaagcct cctcaaatcg ttgtgaaatg cctggccgcc gccgccatct 540 tgttcatctc gacagtgaac tcactgagcg tgcggctggg aagctacgtc cagaacatct 600 tcaccgcggc caagctggtg atcgtggcca tcatcatcat cagcgggctg gtgctcctgg 660 cccaaggaaa cacaaagaat tttgataatt ctttcgaggg cgcccagctg tctgtgggag 720 ccatcagcct ggcgttttac aatggactct gggcctatga tggatggaat caactcaatt 780 acatcacaga agaacttaga aacccttaca gaaacctgcc tttggccatt atcatcggga 840 tccccctggt gacggcgtgc tacatcctca tgaacgtgtc ctacttcacc gtgatgactg 900 ccaccgaact cctgcagtcc caggcggtgg ctgtgacatt tggtgaccgt gttctctatc 960 
ctgcttcttg gatcgttcca ctttttgtgg cattttcaac catcggtgct gctaacggga 1020 cctgcttcac agcgggcaga ctcatttacg tggcgggccg ggagggtcac atgctcaaag 1080 tgctttctta catcagcgtc aggcgcctca ctccagcccc cgccatcatc ttttatggta 1140 tcatagcaac gatttatatc atccctggtg acataaactc gttagtcaat tatttcagct 1200 ttgctgcatg gctgttttat ggcctgacga ttctaggact catcgtgatg agatttacaa 1260 ggaaagagct ggaaaggcct atcaaggtgc ccgtagtcat tcccgtcttg atgacactca 1320 tctctgtgtt tttggttctg gctccaatca tcagcaagcc cacctgggag tacctctact 1380 gtgtgctgtt tatattaagc ggccttttat tttacttcct gtttgtccac tacaagtttg 1440 gatgggctca gaaaatctca aagccgatta ccatgcacct tcagatgcta atggaagtgg 1500 tcccaccgga ggaagaccct gagtaacaag ctccgtctct tgtagccaag tcagctgaat 1560 ttattttctt aagcaatatt tgtggttatt tcttcctttt tttcctacga ataaaatata 1620 ctcagatgtt taaaaaaaaa aaaaaaaaaa a 1651 22 1654 DNA Homo sapiens 22 gcaggcacgg gcggtcagct gggccgcagc tcctccggct ctgcagggtc acggaggaag 60 tctcctggaa ccagcaggag gaaacatggg ggatactggc ctgagaaagc ggagagagga 120 tgagaagtcg atccagagcc aagagcctaa gaccaccagt ctccaaaagg agctgggcct 180 catcagtggc atctccatca tcgtgggcac catcattggc tctgggatct tcgtttcccc 240 caagtctgtg ctcagcaaca cggaagctgt ggggccctgc ctcatcatat gggcggcttg 300 cggggtcctc gcgacgctgg gtgccctgtg ctttgcggag cttggcacaa tgatcaccaa 360 gtcaggggga gagtatccct acctgatgga ggcctacggg cccatccccg cctacctctt 420 ctcctgggcc agcctgatcg tcattaagcc cacgtccttc gccatcatct gcctcagctt 480 ctccgagtat gtgtgtgcgc ccttctatgt gggctgcaag cctcctcaaa tcgttgtgaa 540 atgcctggcc gccgccgcca tcttgttcat ctcgacagtg aactcactga gcgtgcggct 600 gggaagctac gtccagaaca tcttcaccgc ggccaagctg gtgatcgtgg ccatcatcat 660 catcagcggg ctggtgctcc tggcccaagg aaacacaaag aattttgata attctttcga 720 gggcgcccag ctgtctgtgg gagccatcag cctggcgttt tacaatggac tctgggccta 780 tgatggatgg aatcaactca attacatcac agaagaactt agaaaccctt acagaaacct 840 gcctttggcc attatcatcg ggatccccct ggtgacggcg tgctacatcc tcatgaacgt 900 gtcctacttc accgtgatga ctgccaccga actcctgcag tcccaggcgg tggctgtgac 960 atttggtgac cgtgttctct 
atcctgcttc ttggatcgtt ccactttttg tggcattttc 1020 aaccatcggt gctgctaacg ggacctgctt cacagcgggc agactcattt acgtggcggg 1080 ccgggagggt cacatgctca aagtgctttc ttacatcagc gtcaggcgcc tcactccagc 1140 ccccgccatc atcttttatg gtatcatagc aacgatttat atcatccctg gtgacataaa 1200 ctcgttagtc aattatttca gctttgccgc atggctgttt tatggcctga cgattctagg 1260 actcatcgtg atgagattta caaggaaaga gctggaaagg cctatcaagg tgcccgtagt 1320 cattcccgtc ttgatgacac tcatctctgt gtttttggtt ctggctccaa tcatcagcaa 1380 gcccacctgg gagtacctct actgtgtgct gtttatatta agcggccttt tattttactt 1440 cctgtttgtc cactacaagt ttggatgggc tcagaaaatc tcaaagccga ttaccatgca 1500 ccttcagatg ctaatggaag tggtcccacc ggaggaagac cctgagtaac aagctccgtc 1560 tcttgtagcc aagtcagctg aatttatttt cttaagcaat atttgtggtt atttcttcct 1620 ttttttctta cgaataaaat atactcagat gttt 1654 23 1814 DNA Homo sapiens 23 ggcacgaggg cacgggcggt cagctgggcc gcagctcctc cggctctgca gggtcacgga 60 ggaaggtaag taagccagct cccctagtcc aggccgagct tgcacttgcg tcttgtctgc 120 tgctgctgaa ccaagattta gctgtgcgcc ctccttgcag tctcctggaa ccagcaggag 180 gaaacatggg ggatactggc ctgagaaagc ggagagagga tgagaagtcg atccagagcc 240 aagagcctaa gaccaccagt ctccaaaagg agctgggcct catcagtggc atctccatca 300 tcgtgggcac catcattggc tctgggatct tcgtttcccc caagtctgtg ctcagcaaca 360 cggaagctgt ggggccctgc ctcatcatat gggcggcttg cggggtcctc gcgacgctgg 420 gtgccctgtg ctttgcggag cttggcacaa tgatcaccaa gtcaggggga gagtatccct 480 acctgatgga ggcctacggg cccatccccg cctacctctt ctcctgggcc agcctgatcg 540 tcattaagcc cacgtccttc gccatcatct gcctcagctt ctccgagtat gtgtgtgcgc 600 ccttctatgt gggctgcaag cctcctcaaa tcgttgtgaa atgcctggcc gccgccgcca 660 tcttgttcat ctcgacagtg aactcactga gcgtgcggct gggaagctac gtccagaaca 720 tcttcaccgc ggccaagctg gtgatcgtgg ccatcatcat catcagcggg ctggtgctcc 780 tggcccaagg aaacacaaag aattttgata attctttcga gggcgcccag ctgtctgtgg 840 gagccatcag cctggcgttt tacaatggac tctgggccta tgatggatgg aatcaactca 900 attacatcac agaagaactt agaaaccctt acagaaacct gcctttggcc attatcatcg 960 ggatccccct ggtgacggcg tgctacatcc tcatgaacgt 
gtcctacttc accgtgatga 1020 ctgccaccga actcctgcag tcccaggcgg tggctgtgac atttggtgac cgtgttctct 1080 atcctgcttc ttggatcgtt ccactttttg tggcattttc aaccatcggt gctgctaacg 1140 ggacctgctt cacagcgggc agactcattt acgtggcggg ccgggagggt cacatgctca 1200 aagtgctttc ttacatcagc gtcaggcgcc tcactccagc ccccgccatc atcttttatg 1260 gtatcatagc aacgatttat atcatccctg gtgacataaa ctcgttagtc aattatttca 1320 gctttgctgc atggctgttt tatggcctga cgattctagg actcatcgtg atgagattta 1380 caaggaaaga gctggaaagg cctatcaagg tgcccgtagt cattcccgtc ttgatgacac 1440 tcatctctgt gtttttggtt ctggctccaa tcatcagcaa gcccacctgg gagtacctct 1500 actgtgtgct gtttatatta agcggccttt tattttactt cctgtttgtc cactacaagt 1560 ttggatgggc tcagaaaatc tcaaagccga ttaccatgca ccttcagatg ctaatggaag 1620 tggtcccacc ggaggaagac cctgagtaac aagctccgtc tcttgtagcc aagtcagctg 1680 aatttatttt cttaagcaat atttgtggtt atttcttcct ttttttctta cgaataaaat 1740 atactcagat gtttaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1800 aaaaaaaaaa aaaa 1814 24 1476 DNA Homo sapiens 24 aacatggggg atactggcct gagaaagcgg agagaggatg agaagtcgat ccagagccaa 60 gagcctaaga ccaccagtct ccaaaaggag ctgggcctca tcagtggcat ctccatcatc 120 gtgggcacca tcattggctc tgggatcttc gtttccccca agtctgtgct cagcaacacg 180 gaagctgtgg ggccctgcct catcatatgg gcggcttgcg gggtcctcgc gacgctgggt 240 gccctgtgct ttgcggagct tggcacaatg atcaccaagt cagggggaga gtatccctac 300 ctgatggagg cctacgggcc catccccgcc tacctcttct cctgggccag cctgatcgtc 360 attaagccca cgtccttcgc catcatctgc ctcagcttct ccgagtatgt gtgtgcgccc 420 ttctatgtgg gctgcaagcc tcctcaaatc gttgtgaaat gcctggccgc cgccgccatc 480 ttgttcatct cgacagtgaa ctcactgagc gtgcggctgg gaagctacgt ccagaacatc 540 ttcaccgcgg ccaagctggt gatcgtggcc atcatcatca tcagcgggct ggtgctcctg 600 gcccaaggaa acacaaagaa ttttgataat tctttcgagg gcgcccagct gtctgtggga 660 gccatcagcc tggcgtttta caatggactc tgggcctatg atggatggaa tcaactcaat 720 tacatcacag aagaacttag aaacccttac agaaacctgc ctttggccat tatcatcggg 780 atccccctgg tgacggcgtg ctacatcctc atgaacgtgt cctacttcac cgtgatgact 840 gccaccgaac tcctgcagtc 
ccaggcggtg gctgtgacat ttggtgaccg tgttctctat 900 cctgcttctt ggatcgttcc actttttgtg gcattttcaa ccatcggtgc tgctaacggg 960 acctgcttca cagcgggcag actcatttac gtggcgggcc gggagggtca catgctcaaa 1020 gtgctttctt acatcagcgt caggcgcctc actccagccc ccgccatcat cttttatggt 1080 atcatagcaa cgatttatat catccctggt gacataaact cgttagtcaa ttatttcagc 1140 tttgccgcat ggctgtttta tggcctgacg attctaggac tcatcgtgat gagatttaca 1200 aggaaagagc tggaaaggcc tatcaaggtg cccgtagtca ttcccgtctt gatgacactc 1260 atctctgtgt ttttggttct ggctccaatc atcagcaagc ccacctggga gtacctctac 1320 tgtgtgctgt ttatattaag cggcctttta ttttacttcc tgtttgtcca ctacaagttt 1380 ggatgggctc agaaaatctc aaagccgatt accatgcacc ttcagatgct aatggaagtg 1440 gtcccaccgg aggaagaccc tgagtaacaa gctccg 1476 25 1664 DNA Homo sapiens 25 cgagcaggca cgggcggtca gctgggccgc agctcctccg gctctgcagg gtcacggagg 60 aagtctcctg gaaccagcag gaggaaacat gggggatact ggcctgagaa agcggagaga 120 ggatgagaag tcgatccaga gccaagagcc taagaccacc agtctccaaa aggagctggg 180 cctcatcagt ggcatctcca tcatcgtggg caccatcatt ggctctggga tcttcgtttc 240 ccccaagtct gtgctcagca acacggaagc tgtggggccc tgcctcatca tatgggcggc 300 ttgcggggtc ctcgcgacgc tgggtgccct gtgctttgcg gagcttggca caatgatcac 360 caagtcaggg ggagagtatc cctacctgat ggaggcctac gggcccatcc ccgcctacct 420 cttctcctgg gccagcctga tcgtcattaa gcccacgtcc ttcgccatca tctgcctcag 480 cttctccgag tatgtgtgtg cgcccttcta tgtgggctgc aagcctcctc aaatcgttgt 540 gaaatgcctg gccgccgccg ccatcttgtt catctcgaca gtgaactcac tgagcgtgcg 600 gctgggaagc tacgtccaga acatcttcac cgcggccaag ctggtgatcg tggccatcat 660 catcatcagc gggctggtgc tcctggccca aggaaacaca aagaattttg ataattcttt 720 cgagggcgcc cagctgtctg tgggagccat cagcctggcg ttttacaatg gactctgggc 780 ctatgatgga tggaatcaac tcaattacat cacagaagaa cttagaaacc cttacagaaa 840 cctgcctttg gccattatca tcgggatccc cctggtgacg gcgtgctaca tcctcatgaa 900 cgtgtcctac ttcaccgtga tgactgccac cgaactcctg cagtcccagg cggtggctgt 960 gacatttggt gaccgtgttc tctatcctgc ttcttggatc gttccacttt ttgtggcatt 1020 ttcaaccatc ggtgctgcta acgggacctg cttcacagcg 
ggcagactca tttacgtggc 1080 gggccgggag ggtcacatgc tcaaagtgct ttcttacatc agcgtcaggc gcctcactcc 1140 agcccccgcc atcatctttt atggtatcat agcaacgatt tatatcatcc ctggtgacat 1200 aaactcgtta gtcaattatt tcagctttgc cgcatggctg ttttatggcc tgacgattct 1260 aggactcatc gtgatgagat ttacaaggaa agagctggaa aggcctatca aggtgcccgt 1320 agtcattccc gtcttgatga cactcatctc tgtgtttttg gttctggctc caatcatcag 1380 caagcccacc tgggagtacc tctactgtgt gctgtttata ttaagcggcc ttttatttta 1440 cttcctgttt gtccactaca agtttggatg ggctcagaaa atctcaaagc cgattaccat 1500 gcaccttcag atgctaatgg aagtggtccc accggaggaa gaccctgagt aacaagctcc 1560 gtctcttgta gccaagtcag ctgaatttat tttcttaagc aatatttgtg gttatttctt 1620 cctttttttc ttacgaataa aatatactca gatgtttaaa aaaa 1664 26 1918 DNA Homo sapiens 26 cggctgcgag ggccgtgagc tcacggaccg acggaccgac gggcggccgg ccggacagac 60 ggggcagcgc agggagcggg gacgcggcgg gacagcgaca tggccggcca cacgcagcag 120 ccgagcgggc gcgggaaccc caggcctgcg ccctcgccct ccccagtccc agggaccgtc 180 cccggcgcct cggagcgggt ggcgctcaag aaggagatcg ggctgctgag cgcctgcacc 240 atcatcatcg ggaacatcat cggctcgggc atcttcatct cgcccaaggg ggtcctggag 300 cactcaggct ccgtgggtct ggccctgttc gtctgggtcc tgggtggggg cgtgacggct 360 ctgggctccc tctgctatgc agagctggga gtcgccatcc ccaagtctgg cggggactac 420 gcctacgtca cagagatctt cgggggcctg gctggctttc tgctgctctg gagcgccgtc 480 ctcatcatgt accccaccag ccttgctgtc atctccatga ccttctccaa ctacgtgctg 540 cagcccgtgt tccccaactg catccccccc accacagcct cccgggtgct gtccatggcc 600 tgcctgatgc tcctgacatg ggtgaacagc tccagtgtgc gctgggccac gcgcatccag 660 gacatgttca caggcgggaa gctgctggcc ttgtccctca tcatcggcgt gggccttctc 720 cagatcttcc aaggacactt cgaggagctg aggcccagca atgcctttgc tttctggatg 780 acgccctccg tgggacacct ggccctggcc ttcctccagg gctccttcgc cttcagtggc 840 tggaacttcc tcaactatgt caccgaggag atggttgacg cccgaaagaa cctacctcgc 900 gccatcttca tctccatccc actggtgacc ttcgtgtaca cgttcaccaa cattgcctac 960 ttcacggcca tgtcccccca ggagctgctc tcctccaatg cggtggctgt gaccttcggg 1020 gagaagctgc tgggctactt ttcttgggtc atgcctgtct ccgtggctct 
gtcaaccttc 1080 ggagggatca atggttacct gttcacctac tccaggctgt gcttctctgg agcccgcgag 1140 gggcacctgc ccagcctgct ggccatgatc cacgtcagac actgcacccc catccccgcc 1200 ctcctcgtct gttgcggggc cacagccgtc atcatgctcg tgggcgacac gtacacgctc 1260 atcaactatg tgtccttcat caactacctc tgctacggcg tcaccatcct gggcctgctg 1320 ctgctgcgct ggaggcggcc tgcactccac aggcccatca aggtgaacct tctcatcccc 1380 gtggcgtact tggtcttctg ggccttcctg ctggtcttca gcttcatctc agagcctatg 1440 gtctgtgggg tcggcgtcat catcatcctt acgggggtgc ccattttctt tctgggagtg 1500 ttctggagaa gcaaaccaaa gtgtgtgcac agactcacag agtccatgac acactggggc 1560 caggagctgt gtttcgtggt ctacccccag gacgcccccg aagaggagga gaatggcccc 1620 tgcccaccct ccctgctgcc tgccacagac aagccctcga agccacaatg agatttttgt 1680 agagactgaa gcagttgttt ctgtttacat gttgtttatt gaggaggtgt tttggcaaaa 1740 aagttttgtt ttgttttttt ctggaaaaaa aagaaaaaag atacgactct cagaagcctg 1800 ttttaaggaa gccctaaaat gtggactggg tttcctgtct tagcactgcc ctgctagctc 1860 ttcctgaaaa ggcctataaa taaacagggc tggctgttaa aaaaaaaaaa aaaaaaaa 1918 27 1595 DNA Homo sapiens 27 cggcgggaca gcgacatggc cggccacacg cagcagccga gcgggcgcgg gaaccccagg 60 cctgcgccct cgccctcccc agtcccaggg accgtccccg gcgcctcgga gcgggtggcg 120 ctcaagaagg agatcgggct gctgagcgcc tgcaccatca tcatcgggaa catcatcggc 180 tcgggcatct tcatctcgcc caagggggtc ctggagcact caggctccgt gggtctggcc 240 ctgttcgtct gggtcctggg tgggggcgtg acggctctgg gctccctctg ctatgcagag 300 ctgggagtcg ccatccccaa gtctggcggg gactacgcct acgtcacaga gatcttcggg 360 ggcctggctg gctttctgct gctctggagc gccgtcctca tcatgtaccc caccagcctt 420 gctgtcatct ccatgacctt ctccaactac gtgctgcagc ccgtgttccc caactgcatc 480 ccccccacca cagcctcccg ggtgctgtcc atggcctgcc tgatgctcct gacatgggtg 540 aacagctcca gtgtgcgctg ggccacgcgc atccaggaca tgttcacagg cgggaagctg 600 ctggccttgt ccctcatcat cggcgtgggc cttctccaga tcttccaagg acacttcgag 660 gagctgaggc ccagcaatgc ctttgctttc tggatgacgc cctccgtggg acacctggcc 720 ctggccttcc tccagggctc cttcgccttc agtggctgga acttcctcaa ctatgtcacc 780 gaggagatgg ttgacgcccg aaagaaccta cctcgcgcca tcttcatctc 
catcccactg 840 gtgaccttcg tgtacacgtt caccaacatt gcctacttca cggccatgtc cccccaggag 900 ctgctctcct ccaatgcggt ggctgtgacc ttcggggaga agctgctggg ctacttttct 960 tgggtcatgc ctgtctccgt ggctctgtca accttcggag ggatcaatgg ttacctgttc 1020 acctactcca ggctgtgctt ctctggagcc cgcgaggggc acctgcccag cctgctggcc 1080 atgatccacg tcagacactg cacccccatc cccgccctcc tcgtctgttg cggggccaca 1140 gccgtcatca tgctcgtggg cgacacgtac acgctcatca actatgtgtc cttcatcaac 1200 tacctctgct acggcgtcac catcctgggc ctgctgctgc tgcgctggag gcggcctgca 1260 ctccacaggc ccatcaaggt gaaccttctc atccccgtgg cgtacttggt cttctgggcc 1320 ttcctgctgg tcttcagctt catctcagag cctatggtct gtggggtcgg cgtcatcatc 1380 atccttacgg gggtgcccat tttctttctg ggagtgttct ggagaagcaa accaaagtgt 1440 gtgcacagac tcacagagtc catgacacac tggggccagg agctgtgttt cgtggtctac 1500 ccccaggacg cccccgaaga ggaggagaat ggcccctgcc caccctccct gctgcctgcc 1560 acagacaagc cctcgaagcc acaatgagat ttttg 1595 28 1962 DNA Homo sapiens 28 agtcccgggc ccggagccag cgcatgcgcc cgcctgtggg cgctgtcccg gctgcgaggg 60 ccgtgagctc acggaccgac ggaccgacgg gcggccggcc ggacagacgg ggcagcgcag 120 ggagcgggga cgcggcggga cagcgacatg gccggccaca cgcagcagcc gagcgggcgc 180 gggaacccca ggcctgcgcc ctcgccctcc ccagtcccag ggaccgtccc cggcgcctcg 240 gagcgggtgg cgctcaagaa ggagatcggg ctgctgagcg cctgcaccat catcatcggg 300 aacatcatcg gctcgggcat cttcatctcg cccaaggggg tcctggagca ctcaggctcc 360 gtgggtctgg ccctgttcgt ctgggtcctg ggtgggggcg tgacggctct gggctccctc 420 tgctatgcag agctgggagt cgccatcccc aagtctggcg gggactacgc ctacgtcaca 480 gagatcttcg ggggcctggc tggctttctg ctgctctgga gcgccgtcct catcatgtac 540 ccaccagcct tgctgtcatc tccgtgacct tctccaacta cgtgctgcag cccgtgttcc 600 ccaactgcat cccccccacc acagcctccc gggtgctgtc catggcctgc ctgatgctcc 660 tgacatgggt gaacagctcc agtgtgcgct gggccacgcg catccaggac atgttcacag 720 gcgggaagct gctggccttg tccctcatca tcggcgtggg ccttctccag atcttccaag 780 gacacttcga ggagctgagg cccagcaatg cctttgcttt ctggatgacg ccctccgtgg 840 gacacctggc cctggccttc ctccagggct ccttcgcctt cagtggctgg aacttcctca 900 actatgtcac 
cgaggagatg gttgacgccc gaaagaacct acctcgcgcc atcttcatct 960 ccatcccact ggtgaccttc gtgtacacgt tcaccaacat tgcctacttc acggccatgt 1020 ccccccagga gctgctctcc tccaatgcgg tggctgtgac cttcggggag aagctgctgg 1080 gctacttttc ttgggtcatg cctgtctccg tggctctgtc aaccttcgga gggatcaatg 1140 gttacctgtt cacctactcc aggctgtgct tctctggagc ccgcgagggg cacctgccca 1200 gcctgctggc catgatccac gtcagacact gcacccccat ccccgccctc ctcgtctgtt 1260 gcggggccac agccgtcatc atgctcgtgg gcgacacgta cacgctcatc aactatgtgt 1320 ccttcatcaa ctacctctgc tacggcgtca caatcctggg cctgctgctg ctgcgctgga 1380 ggcggcctgc actccacagg cccatcaagg tgaaccttct catccccgtg gcgtacttgg 1440 tcttctgggc cttcctgctg gtcttcagct tcatctcaga gcctatggtc tgtggggtcg 1500 gcgtcatcat catccttacg ggggtgccca ttttctttct gggagtgttc tggagaagca 1560 aaccaaagtg tgtgcacaga ctcacagagt ccatgacaca ctggggccag gagctgtgtt 1620 tcgtggtcta cccccaggac gcccccgaag aggaggagaa tggcccctgc ccaccctccc 1680 tgctgcctgc cacagacaag ccctcgaagc cacaatgaga tttttgtaga gactgaagca 1740 gttgtttctg tttacatgtt gtttattgag gaggtgtttt ggcaaaaaag ttttgttttg 1800 tttttttctg gaaaaaaaag aaaaaagata cgactctaag aagcctgttt taaggaagcc 1860 ctaaaatgtg gactgggttt cctgtcttag cactgccctg ctagctcttc ctgaaaaggc 1920 ctataaataa acagggctgg ctgttcaaaa aaaaaaaaaa aa 1962 29 2482 DNA Homo sapiens 29 cgaggaggtg gagaattgag agcacgatgc atacacaggt gtttctgagt agtaattaga 60 tcgctgtgaa ggaaaaagca cacctttgag ttttcacctg tgaacactat agcgctgaga 120 gagacagtct gaaagcagag gaagacatcg atcagtaaca ccaagagaca ccaaagttga 180 aagttttgtt ttctttccct ctgttttatt tttcccccgt gtgtccctac tatggtcaga 240 aagcctgttg tgtccaccat ctccaaagga ggttacctgc agggaaatgt taacgggagg 300 ctgccttccc tgggcaacaa ggagccacct gggcaggaga aagtgcagct gaagaggaaa 360 gtcactttac tgaggggagt ctccattatc attggcacca tcattggagc aggaatcttc 420 atctctccta agggcgtgct ccagaacacg ggcagcgtgg gcatgtctct gaccatctgg 480 acggtatgtg gggtcctgtc actatttgga gctttgtctt atgctgaatt gggaacaact 540 ataaagaaat ctggaggtca ttacacatat attttggaag tctttggtcc attaccagct 600 tttgtacgag tctgggtgga 
actcctcata atacgccctg cagctactgc tgtgatatcc 660 ctggcatttg gacgctacat tctggaacca ttttttattc aatgtgaaat ccctgaactt 720 gcgatcaagc tcattacagc tgtgggcata actgtagtga tggtcctaaa tagcatgagt 780 gtcagctgga gcgcccggat ccagattttc ttaacctttt gcaagctcac agcaattctg 840 ataattatag tccctggagt tatgcagcta attaaaggtc aaacgcagaa ctttaaagac 900 gccttttcag gaagagattc aagtattacg cggttgccac tggcttttta ttatggaatg 960 tatgcatatg ctggctggtt ttacctcaac tttgttactg aagaagtaga aaaccctgaa 1020 aaaaccattc cccttgcaat atgtatatcc atggccattg tcaccattgg ctatgtgctg 1080 acaaatgtgg cctactttac gaccattaat gctgaggagc tgctgctttc aaatgcagtg 1140 gcagtgacct tttctgagcg gctactggga aatttctcat tagcagttcc gatctttgtt 1200 gccctctcct gctttggctc catgaacggt ggtgtgtttg ctgtctccag gttattctat 1260 gttgcgtctc gagagggtca ccttccagaa atcctctcca tgattcatgt ccgcaagcac 1320 actcctctac cagctgttat tgttttgcac cctttgacaa tgataatgct cttctctgga 1380 gacctcgaca gtcttttgaa tttcctcagt tttgccaggt ggctttttat tgggctggca 1440 gttgctgggc tgatttatct tcgatacaaa tgcccagata tgcatcgtcc tttcaaggtg 1500 ccactgttca tcccagcttt gttttccttc acatgcctct tcatggttgc cctttccctc 1560 tattcggacc catttagtac agggattggc ttcgtcatca ctctgactgg agtccctgcg 1620 tattatctct ttattatatg ggacaagaaa cccaggtggt ttagaataat gtcagagaaa 1680 ataaccagaa cattacaaat aatactggaa gttgtaccag aagaagataa gttatgaact 1740 aatggacttg agatcttggc aatctgccca aggggagaca caaaataggg atttttactt 1800 cattttctga aagtctagag aattacaact ttggtgataa acaaaaggag tcagttattt 1860 ttattcatat attttagcat attcgaacta atttctaaga aatttagtta taactctatg 1920 tagttataga aagtgaatat gcagttattc tatgagtcgc acaattcttg agtctctgat 1980 acctacctat tggggttagg agaaaagact agacaattac tatgtggtca ttctctacaa 2040 catatgttag cacggcaaag aaccttcaaa ttgaagactg agatttttct gtatatatgg 2100 gttttgtaag atggttttac acactacaga tgtctatact gtgaaaagtg ttttcaattc 2160 tgaaaaaaag catacatcat gattatggca aagaggagag aaagaaattt attttacatt 2220 gacattgcat tgcttcccct tagataccaa tttagataac aaacactcat gctttaatgg 2280 attataccca gagcactttg aacaaaggtc 
agtggggatt gttgaataca ttaaagaaga 2340 gtttctaggg gctactgttt atgagacaca tccaggagtt atgtttaagt aaaaatcctt 2400 gagaatttaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 2460 aaaaaaaaaa aaaaaaaaaa aa 2482 30 1861 DNA Homo sapiens 30 ggaacgagga ggtggagaat tgagagcacg atgcatacac aggtgtttct gagtagtaat 60 tagatcgctg tgaaggaaaa agcacacctt tgagttttca cctgtgaaca ctatagcgct 120 gagagagaca gtctgaaagc agaggaagac atcgatcagt aacaccaaga gacaccaaag 180 ttgaaagttt tgttttcttt ccctctgttt tatttttccc ccgtgtgtcc ctactatggt 240 cagaaagcct gttgtgtcca ccatctccaa aggaggttac ctgcagggaa atgttaacgg 300 gaggctgcct tccctgggca acaaggagcc acctgggcag gagaaagtgc agctgaagag 360 gaaagtcact ttactgaggg gagtctccat tatcattggc accatcattg gagcaggaat 420 cttcatctct cctaagggcg tgctccagaa cacgggcagc gtgggcatgt ctctgaccat 480 ctggacggtg tgtggggtcc tgtcactatt tggagctttg tcttatgctg aattgggaac 540 aactataaag aaatctggag gtcattacac atatattttg gaagtctttg gtccattacc 600 agcttttgta cgagtctggg tggaactcct cataatacgc cctgcagcta ctgctgtgat 660 atccctggca tttggacgct acattctgga accatttttt attcaatgtg aaatccctga 720 acttgcgatc aagctcatta cagctgtggg cataactgta gtgatggtcc taaatagcat 780 gagtgtcagc tggagcgccc ggatccagat tttcttaacc ttttgcaagc tcacagcaat 840 tctgataatt atagtccctg gagttatgca gctaattaaa ggtcaaacgc agaactttaa 900 agacgcgttt tcaggaagag attcaagtat tacgcggttg ccactggctt tttattatgg 960 aatgtatgca tatgctggct ggttttacct caactttgtt actgaagaag tagaaaaccc 1020 tgaaaaaacc attccccttg caatatgtat atccatggcc attgtcacca ttggctatgt 1080 gctgacaaat gtggcctact ttacgaccat taatgctgag gagctgctgc tttcaaatgc 1140 agtggcagtg accttttctg agcggctact gggaaatttc tcattagcag ttccgatctt 1200 tgttgccctc tcctgctttg gctccatgaa cggtggtgtg tttgctgtct ccaggttatt 1260 ctatgttgcg tctcgagagg gtcaccttcc agaaatcctc tccatgattc atgtccgcaa 1320 gcacactcct ctaccagctg ttattgtttt gcaccctttg acaatgataa tgctcttctc 1380 tggagacctc gacagtcttt tgaatttcct cagttttgcc aggtggcttt ttattgggct 1440 ggcagttgct gggctgattt atcttcgata caaatgccca gatatgcatc gtcctttcaa 1500 
ggtgccactg ttcatcccag ctttgttttc cttcacatgc ctcttcatgg ttgccctttc 1560 cctctattcg gacccattta gtacagggat tggcttcgtc atcactctga ctggagtccc 1620 tgcgtattat ctctttatta tatgggacaa gaaacccagg tggtttagaa taatgtcaga 1680 gaaaataacc agaacattac aaataatact ggaagttgta ccagaagaag ataagttatg 1740 aactaatgga cttgagatct tggcaatctg cccaagggga gacacaaaat agggattttt 1800 acttcatttt ctgaaagtct agagaattac aactttggtg ataaaaaaaa aaaaaaaaaa 1860 a 1861 31 3144 DNA Homo sapiens 31 atggtcagaa agcctgttgt gtccaccatc tccaaaggag gttacctgca gggaaatgtt 60 aacgggaggc tgccttccct gggcaacaag gagccacctg ggcaggagaa agtgcagctg 120 aagaggaaag tcactttact gaggggagtc tccattatca ttggcaccat cattggagca 180 ggaatcttca tctctcctaa gggcgtgctc cagaacacgg gcagcgtggg catgtctctg 240 accatctgga cggtgtgtgg ggtcctgtca ctatttggag ctttgtctta tgctgaattg 300 ggaacaacta taaagaaatc tggaggtcat tacacatata ttttggaagt ctttggtcca 360 ttaccagctt ttgtacgagt ctgggtggaa ctcctcataa tacgccctgc agctactgct 420 gtgatatccc tggcatttgg acgctacatt ctggaaccat tttttattca atgtgaaatc 480 cctgaacttg cgatcaagct cattacagct gtgggcataa ctgtagtgat ggtcctaaat 540 agcatgagtg tcagctggag cgcccggatc cagattttct taaccttttg caagctcaca 600 gcaattctga taattatagt ccctggagtt atgcagctaa ttaaaggtca aacgcagaac 660 tttaaagacg cgttttcagg aagagattca agtattacgc ggttgccact ggctttttat 720 tatggaatgt atgcatatgc tggctggttt tacctcaact ttgttactga agaagtagaa 780 aaccctgaaa aaaccattcc ccttgcaata tgtatatcca tggccattgt caccattggc 840 tatgtgctga caaatgtggc ctactttacg accattaatg ctgaggagct gctgctttca 900 aatgcagtgg cagtgacctt ttctgagcgg ctactgggaa atttctcatt agcagttccg 960 atctttgttg ccctctcctg ctttggctcc atgaacggtg gtgtgtttgc tgtctccagg 1020 ttattctatg ttgcgtctcg agagggtcac cttccagaaa tcctctccat gattcatgtc 1080 cgcaagcaca ctcctctacc agctgttatt gttttgcacc ctttgacaat gataatgctc 1140 ttctctggag acctcgacag tcttttgaat ttcctcagtt ttgccaggtg gctttttatt 1200 gggctggcag ttgctgggct gatttatctt cgatacaaat gcccagatat gcatcgtcct 1260 ttcaaggtgc cactgttcat cccagctttg ttttccttca catgcctctt 
catggttgcc 1320 ctttccctct attcggaccc atttagtaca gggattggct tcgtcatcac tctgactgga 1380 gtccctgcgt attatctctt tattatatgg gacaagaaac ccaggtggtt tagaataatg 1440 tcagagaaaa taaccagaac attacaaata atactggaag ttgtaccaga agaagataag 1500 ttatgaacta atggacttga gatcttggca atctgcccaa ggggagacac aaaataggga 1560 tttttacttc attttctgaa agtctagaga attacaactt tggtgataaa caaaaggagt 1620 cagttatttt tattcatata ttttagcata ttcgaactaa tttctaagaa atttagttat 1680 aactctatgt agttatagaa agtgaatatg cagttattct atgagtcgca caattcttga 1740 gtctctgata cctacctatt ggggttagga gaaaagacta gacaattact atgtggtcat 1800 tctctacaac atatgttagc acggcaaaga accttcaaat tgaagactga gatttttctg 1860 tatatatggg ttttgtaaag atggttttac acactacaga tgtctatact gtgaaaagtg 1920 ttttcaattc tgaaaaaaag catacatcat gattatggca aagaggagag aaagaaattt 1980 attttacatt gacattgcat tgcttcccct tagataccaa tttagataac aaacactcat 2040 gctttaatgg attataccca gagcactttg aacaaaggtc agtggggatt gttgaataca 2100 ttaaagaaga gtttctaggg gctactgttt atgagacaca tccaggagtt atgtttaagt 2160 aaaaatcctt gagaatttat tatgtcagat gttttttcat tcattatcag gaagttttag 2220 ttatctgtca tttttttttt tcacatcagt ttgatcagga aagtgtataa cacatcttag 2280 agcaagagtt agtttggtat taaatcctca ttagaacaac cacctgtttc actaataact 2340 tacccctgat gagtctatct aaacatatgc attttaagcc ttcaaattac attatcaaca 2400 tgagagaaat aaccaacaaa gaagatgttc aaaataatag tcccatatct gtaatcatat 2460 ctacatgcaa tgttagtaat tctgaagttt tttaaattta tggctatttt tacacgatga 2520 tgaattttga cagtttgtgc attttcttta tacattttat attcttctgt taaaatatct 2580 cttcagatga aactgtccag attaattagg aaaaggcata tattaacata aaaattgcaa 2640 aagaaatgtc gctgtaaata agatttacaa ctgatgtttc tagaaaattt ccacttctat 2700 atctaggctt tgtcagtaat ttccacacct taattatcat tcaacttgca aaagagacaa 2760 ctgataagaa gaaaattgaa atgagaatct gtggataagt gtttgtgttc agaagatgtt 2820 gttttgccag tattagaaaa tactgtgagc cgggcatggt ggcttacatc tgtaatccca 2880 gcactttggg aggctgaggg ggtggatcac ctgaggtcgg gagttctaga ccagcctgac 2940 caacatggag aaaccccatc tctactaaaa atacaaaatt agctgggcat ggtggcacat 
3000 gctggtaatc tcagctattg aggaggctga ggcaggagaa ttgcttgaac ccgggaggcg 3060 gaggttgcag tgagccaaga ttgcaccact gtactccagc ctgggtgaca aagtcagact 3120 ccatctccaa aaaaaaaaaa aaaa 3144 32 520 DNA Homo sapiens 32 atggtcagaa agcctgttgt gtccaccatc tccaaaggag gttacctgca gggaaatgtt 60 aacgggaggc tgccttccct gggcaacaag gagccacctg ggcaggagaa agtgcagctg 120 aagaggaaag tcactttact gaggggagtc tccattatca ttggcaccat cattggagca 180 ggaatcttca tctctcctaa gggcgtgctc cagaacacgg gcagcgtggg catgtctctg 240 accatctgga cggtgtgtgg ggtcctgtca ctatttggag ctttgtctta tgctgaattg 300 ggaacaacta taaagaaatc tggaggtcat tacacatata ttttggaagt ctttggtcca 360 ttaccagctt ttgtacgagt ctgggtggaa ctcctcataa tacgccctgc agctactgct 420 gtgatatccc tggcatttgg acgctacatt ctggaaccat tttttattca atgtgaaatc 480 cctgaacttg cgatcaagct cattacagct gtgggcataa 520 33 1542 DNA Homo sapiens 33 tccctactat ggtcagaaag cctgttgtgt ccaccatctc caaaggaggt tacctgcagg 60 gaaatgttaa cgggaggctg ccttccctgg gcaacaagga gccacctggg caggagaaag 120 tgcagctgaa gaggaaagtc actttactga ggggagtctc cattatcatt ggcaccatca 180 ttggagcagg aatcttcatc tctcctaagg gcgtgctcca gaacacgggc agcgtgggca 240 tgtctctgac catctggacg gtgtgtgggg tcctgtcact atttggagct ttgtcttatg 300 ctgaattggg aacaactata aagaaatctg gaggtcatta cacatatatt ttggaagtct 360 ttggtccatt accagctttt gtacgagtct gggtggaact cctcataata cgccctgcag 420 ctactgctgt gatatccctg gcatttggac gctacattct ggaaccattt tttattcaat 480 gtgaaatccc tgaacttgcg atcaagctca ttacagctgt gggcataact gtagtgatgg 540 tcctaaatag catgagtgtc agctggagcg cccggatcca gattttctta accttttgca 600 agctcacagc aattctgata attatagtcc ctggagttat gcagctaatt aaaggtcaaa 660 cgcagaactt taaagacgcc ttttcaggaa gagattcaag tattacgcgg ttgccactgg 720 ctttttatta tggaatgtat gcatatgctg gctggtttta cctcaacttt gttactgaag 780 aagtagaaaa ccctgaaaaa accattcccc ttgcaatatg tatatccatg gccattgtca 840 ccattggcta tgtgctgaca aatgtggcct actttacgac cattaatgct gaggagctgc 900 tgctttcaaa tgcagtggca gtgacctttt ctgagcggct actgggaaat ttctcattag 960 cagttccgat ctttgttgcc ctctcctgct 
ttggctccat gaacggtggt gtgtttgctg 1020 tctccaggtt attctatgtt gcgtctcgag agggtcacct tccagaaatc ctctccatga 1080 ttcatgtccg caagcacact cctctaccag ctgttattgt tttgcaccct ttgacaatga 1140 taatgctctt ctctggagac ctcgacagtc ttttgaattt cctcagtttt gccaggtggc 1200 tttttattgg gctggcagtt gctgggctga tttatcttcg atacaaatgc ccagatatgc 1260 atcgtccttt caaggtgcca ctgttcatcc cagctttgtt ttccttcaca tgcctcttca 1320 tggttgccct ttccctctat tcggacccat ttagtacagg gattggcttc gtcatcactc 1380 tgactggagt ccctgcgtat tatctcttta ttatatggga caagaaaccc aggtggttta 1440 gaataatgtc agagaaaata accagaacat tacaaataat actggaagtt gtaccagaag 1500 aagataagtt atgaactaat ggacttgaga tctggcaatc tg 1542 34 2000 DNA Homo sapiens 34 cctgtgaaca ctatagcgct gagagagaca gtctgaaagc agaggaagac atcgatcagt 60 aacaccaaga gacaccaaag ttgaaagttt tgttttcttt ccctctgttt tatttttccc 120 ccgtgtgtcc ctactatggt cagaaagcct gttgtgtcca ccatctccaa aggaggttac 180 ctgcagggaa atgttaacgg gaggctgcct tccctgggca acaaggagcc acctgggcag 240 gagaaagtgc agctgaagag gaaagtcact ttactgaggg gagtctccat tatcattggc 300 accatcattg gagcaggaat cttcatctct cctaagggcg tgctccagaa cacgggcagc 360 gtgggcatgt ctctgaccat ctggacggtg tgtggggtcc tgtcactatt tggagctttg 420 tcttatgctg aattgggaac aactataaag aaatctggag gtcattacac atatattttg 480 gaagtctttg gtccattacc agcttttgta cgagtctggg tggaactcct cataatacgc 540 cctgcagcta ctgctgtgat atccctggca tttggacgct acattctgga accatttttt 600 attcaatgtg aaatccctga acttgcgatc aagctcatta cagctgtggg cataactgta 660 gtgatggtcc taaatagcat gagtgtcagc tggagcgccc ggatccagat tttcttaacc 720 ttttgcaagc tcacagcaat tctgataatt atagtccctg gagttatgca gctaattaaa 780 ggtcaaacgc agaactttaa agacgccttt tcaggaagag attcaagtat tacgcggttg 840 ccactggctt tttattatgg aatgtatgca tatgctggct ggttttacct caactttgtt 900 actgaagaag tagaaaaccc tgaaaaaacc attccccttg caatatgtat atccatggcc 960 attgtcacca ttggctatgt gctgacaaat gtggcctact ttacgaccat taatgctgag 1020 gagctgctgc tttcaaatgc agtggcagtg accttttctg agcggctact gggaaatttc 1080 tcattagcag ttccgatctt tgttgccctc tcctgctttg 
gctccatgaa cggtggtgtg 1140 tttgctgtct ccaggttatt ctatgttgcg tctcgagagg gtcaccttcc agaaatcctc 1200 tccatgattc atgtccgcaa gcacactcct ctaccagctg ttattgtttt gcaccctttg 1260 acaatgataa tgctcttctc tggagacctc gacagtcttt tgaatttcct cagttttgcc 1320 aggtggcttt ttattgggct ggcagttgct gggctgattt atcttcgata caaatgccca 1380 gatatgcatc gtcctttcaa ggtgccactg ttcatcccag ctttgttttc cttcacatgc 1440 ctcttcatgg ttgccctttc cctctattcg gacccattta gtacagggat tggcttcgtc 1500 atcactctga ctggagtccc tgcgtattat ctctttatta tatgggacaa gaaacccagg 1560 tggtttagaa taatgtcagg gttcctagca ctgatgcctg cacaagcatg tgatatgtga 1620 aataaaatgg attcttctat agctaaatga gttccctctg gggagagttc tggtactgca 1680 atcacaatgc cagatggtgt ttatgggcta tttgtgtaag taagtggtaa gatgctatga 1740 agtaagtgtg tttgttttca tcttatggaa actcttgatg catgtgcttt tgtatggaat 1800 aaattttggt gcaatatgat gtcattcaac tttgcattga attgaatttt ggttgtattt 1860 atatgtatta tacctgtcac gcttctagtt gcttcaacca ttttataacc atttttgtac 1920 atattttact tgaaaatatt ttaaatggaa atttaaataa acatttgata gtttacataa 1980 taaaaaaaaa aaaaaaaaaa 2000 35 2114 DNA Homo sapiens 35 cggcgggcgg cgagcggcgc gcacaatcct cgctcggctg cggctcccgg gtgtcccagg 60 cccggccggt aagcagagca tggccggtgc gggcccgaag cggcgcgcgc tagcggcccc 120 ggtggccgag gagaaggaag aggcgcggga gaagataatg gccgccaagc gcgcggacgg 180 cgcggcgccg gcaggcgagg gcgagggcgt gaccctgcag gggaacatca cgctactcaa 240 gggcgtggcc gtcatcgtgg tcgccatcat gggctcgggc atcttcgtga cgcccacggg 300 cgtgctcaag gaggcaggct cgccggggct ggcgctggtg gtgtgggccg cgtgcggcgt 360 cttctccatc gtgggcgcgc tctgctacgc ggagctgggc accaccatct ccaaatcggg 420 cggcgactac gcctacatgc tggacgtcta cggctcgctg cccgccttcc tcaagctctg 480 gatcgagctg ctcatcatcc ggccttcatc gcagtacatc gtggccctgg tcttcgccac 540 ctacctgctc aagccgctct tccccacctg cccggtgccc gaggaggcag ccaagctcgt 600 agcctgccac tccgtgcatt gacttgatgc acaacatcac aaaggcgatt tttgagaact 660 gatagtgcat cagccgaccc aggtaattta aaatattctt catccagaga tagaggtggt 720 tcttcctctt acggactgca accttcaaat tcagctgtgg tgtctcggca aaggcacgat 780 gataccagag 
tccacgctga catacagaat gacgaaaagg agagatcaat gtcttattgt 840 gatgagtctc gactgtcaaa tcttcttcgg aggatcaccc gggaagacga cagagactga 900 agattggtca ctgtaaagca gttgaaagaa tttattcagc aaccagaaaa taagctggta 960 ctagttaaac aattggataa tatcttggct gctgcacatg atgtgcttaa tgaaagtagc 1020 aaattgcttc aggagttgag acaggaggga gcttgctgtc ttggccttct ttgtgcttct 1080 ctgagctatg aggctgagaa gatcttcaag tggattttta gcaaatttag ctcatctgca 1140 aaagatgaag ttaaactcct ctacttatgt gccacctaca aagcactaga gactgtagga 1200 gaaaagaaag ccttttcatc tgtaatgcag cttgtaatga ccagcctgca gtcaattctt 1260 gaaaatgtgg atacaccaga attgctttgc aaatgtgtta agtgcattct tttggtggct 1320 cgatgttacc ctcatatttt cagcgctaat tttagggata cagttgatat attagttgga 1380 tggcatagag atcatactca gaaaccttcg ctcacgcagc aggtatctgg gtggttgcag 1440 agtttggagc cattttgggt agctgatctt gcatttccta cgactcttct tggtcagttt 1500 ctagaagaca tggaagcata tgctgaggac ctcagccatg tggcctctgg ggaatcaggg 1560 atgaagacgt ccctcctcca tcagtgtcat caccaaagct ggctgcgctt ctccgggtat 1620 ttagtactgt gctgaggagc attggggaac gcttcagccc aattcgggtc ctccaattac 1680 tgaggcatac gtaacagttg ttctgtacag agtaatgaga tgtgtgacgg ctgcagacca 1740 ggtgtttttt tctgaggctg tgttgacagc tgctaatgag tgtgttggtg ttttgctcgg 1800 cagcttggat cctagcatga ctatacattg tgacatggtc attacatatg gattagacca 1860 actggagaat tgccagactt gtggtaccga ttataccatc tcagtcttga atttactcac 1920 gctgattgtt gaacagataa atacgaaact gccatcatca tttgtagaaa aactgtttat 1980 accatcatct aaactactat tcttgcgtta tcataaagaa aaagaggttg ttgctgtagc 2040 ccatgctgtt tatcaagcaa tgctcagctt gaagaatatt cctgttttgg agactgccta 2100 taagttaaat tggg 2114 36 2133 DNA Homo sapiens 36 gcgggcggcg agcggcgcgc acaatcctcg ctcggctgcg gctctcgggt gtcccaggcc 60 cggccggtaa gcagagcatg gcgggtgcgg gcccgaagcg gcgggcgcta gcggccccgg 120 tggccgagga gaaggaagag gcgcgggaga agatgctggc ctccaagcgc gcggacggcg 180 cggcgccggc aggcgagggc gagggcgtga ccctgcagcg gaacatcacg ctactcaacg 240 gcgtggccat catcgtgggc gccatcatcg gctcgggcat cttcgtgacg cccacgggcg 300 tgcttaagga ggcaggctcg ccggggctgg cgctggtgat 
gtgggccgcg tgcggcgtct 360 tctccatcgt gggcgcgctc tgctacgcgg agctgggcac caccatctcc aaatcgggcg 420 gcgactacgc ctacatgctg gacgtctacg gctcgctgcc cgccttcctc aagctctgga 480 tcgagctgct cgtcatccgg ccttcatcgc agtacatcgt ggccctggtc ttcgccacct 540 acctgctcaa gccgctcttc cccagctgcc cggtgcccga ggaggcagcc aagctcatgg 600 cctgccactg cgtgcattga cttgatgcaa aacatcacaa aggctccatg ttacacagta 660 tgcgattttt gagaactgat agtgcatcag ctgacccagg taatttaaaa tattcttcat 720 ccagagatag aggtggttct tcctcttacg gactgcaacc ttcaaattca gctgtggtgt 780 ctcggcaaag gcacgatgat accagagtcc acgctgacat acagaatgac gaaaaggaga 840 gatcgatgtc ttattgtgat gagtctcgac tgtcaaatct tcttcggagg atcacccggg 900 aagacgacag agactgaaga ttggtcactg taaagcagtt gaaagaattt attcagcaac 960 cagaaaataa gctggtacta gttaaacaat tggataatat cttggctgct gcacatgatg 1020 tgcttaatga aagtagcaaa ttgcttcagg agttgagaca ggagggagct tgctgtcttg 1080 gccttctttg tgcttctctg agctatgagg ctgagaagat cttcaagtgg atttttagca 1140 aatttagctc atctgcaaaa gatgaagtta aactcctcta cttatgtgcc acctacaaag 1200 cactagagac tgtaggagaa aagaaagcct tttcatctgt aatgcagctt gtaatgacca 1260 gcctgcagtc aattcttgaa aatgtggata caccagaatt gctttgcaaa tgtgttaagt 1320 gcattctttt ggtggctcga tgttaccctc atattttcag cgctaatttt agggatacag 1380 ttgatatatt agttggatgg catagagatc atactcagaa accttcgctc acgcagcagg 1440 tatctgggtg gttgcagagt ttggagccat tttgggtagc tgatcttgca tttcctacga 1500 ctcttcttgg tcagtttcta gaagacatgg aagcatatgc tgaggacctc agccatgtgg 1560 cctctgggga atcagtggat gaagacgtcc ctcctccatc agtgtcatca ccaaagctgg 1620 ctgcgcttct ccgggtattt agtactgtgc tgaggagcat tggggaacgc ttcagcccaa 1680 ttcgggtcct ccaattactg aggcatacgt aacagttgtt ctgtacagag taatgagatg 1740 tgtgacggct gcaaaccagg tgtttttttc tgaggctgtg ttgacagctg ctaatgagtg 1800 tgttggtgtt ttgctcggca gcttggatcc tagcatgact atacattgtg acatggtcat 1860 tacatatgga ttagaccaac tggagaattg ccagacttgt ggtaccaatt atatcatctc 1920 agtcttgaat ttactcacgc tgattgttga acagataaat acgaaactgc catcatcatt 1980 tgtagaaaaa ctgtttatac catcatctaa actactattc ttgcgttatc ataaagaaaa 
2040 agaggttgtt gctgtagccc atgctgttta tcaagcaatg ctcagcttga agaatattcc 2100 tgttttggag actgcctata agttaatatt ggg 2133 37 524 PRT Homo sapiens 37 Ser Leu Gly Arg Gly Ser Arg Val Ser Gln Ala Arg Pro Val Arg Arg 1 5 10 15 Ala Trp Arg Val Arg Ala Arg Lys Arg Arg Ala Leu Ala Ala Pro Ala 20 25 30 Ala Glu Lys Lys Lys Glu Ala Arg Glu Lys Met Leu Ala Ala Lys Ser 35 40 45 Ala Asp Gly Ser Ala Pro Ala Gly Glu Gly Glu Gly Val Thr Leu Gln 50 55 60 Arg Asn Ile Thr Leu Leu Asn Gly Val Ala Ile Ile Val Gly Thr Ile 65 70 75 80 Ile Gly Ser Gly Ile Phe Val Thr Pro Thr Gly Val Leu Lys Glu Ala 85 90 95 Gly Ser Pro Gly Leu Ala Leu Val Val Trp Ala Ala Cys Gly Val Phe 100 105 110 Ser Ile Val Gly Ala Leu Cys Tyr Ala Glu Leu Gly Thr Thr Ile Ser 115 120 125 Lys Ser Gly Gly Asp Tyr Ala Tyr Met Leu Glu Val Tyr Gly Ser Leu 130 135 140 Pro Ala Phe Leu Lys Leu Trp Ile Glu Leu Leu Ile Ile Arg Pro Ser 145 150 155 160 Ser Gln Tyr Ile Val Ala Leu Val Phe Ala Thr Tyr Leu Leu Lys Pro 165 170 175 Leu Phe Pro Thr Cys Pro Val Pro Glu Glu Ala Ala Lys Leu Val Ala 180 185 190 Cys Leu Cys Val Leu Leu Leu Thr Ala Val Asn Cys Tyr Ser Val Lys 195 200 205 Ala Ala Thr Arg Val Gln Asp Ala Phe Ala Ala Ala Lys Leu Leu Ala 210 215 220 Leu Ala Leu Ile Ile Leu Leu Gly Phe Val Gln Ile Gly Lys Gly Asp 225 230 235 240 Val Ser Asn Leu Asp Pro Asn Phe Ser Phe Glu Gly Thr Lys Leu Asp 245 250 255 Val Gly Asn Ile Val Leu Ala Leu Tyr Ser Gly Leu Phe Ala Tyr Gly 260 265 270 Gly Trp Asn Tyr Leu Asn Phe Val Thr Glu Glu Met Ile Asn Pro Tyr 275 280 285 Arg Asn Leu Pro Leu Ala Ile Ile Ile Ser Leu Pro Ile Val Thr Leu 290 295 300 Val Tyr Val Leu Thr Asn Leu Ala Tyr Phe Thr Thr Leu Ser Thr Glu 305 310 315 320 Gln Met Leu Ser Ser Glu Ala Val Ala Val Asp Phe Gly Asn Tyr His 325 330 335 Leu Gly Val Met Ser Trp Ile Ile Pro Val Phe Val Gly Leu Ser Cys 340 345 350 Phe Gly Ser Val Asn Gly Ser Leu Phe Thr Ser Ser Arg Leu Phe Phe 355 360 365 Val Gly Ser Arg Glu Gly His Leu Pro Ser Ile Leu Ser Met Ile His 370 375 380 Pro Gln Leu Leu Thr Pro Val Pro 
Ser Leu Val Phe Thr Cys Val Met 385 390 395 400 Thr Leu Leu Tyr Ala Phe Ser Lys Asp Ile Phe Ser Val Ile Asn Phe 405 410 415 Phe Ser Phe Phe Asn Trp Leu Cys Val Ala Leu Ala Ile Ile Gly Met 420 425 430 Ile Trp Leu Arg His Arg Lys Pro Glu Leu Glu Arg Pro Ile Lys Val 435 440 445 Asn Leu Ala Leu Pro Val Phe Phe Ile Leu Ala Cys Leu Phe Leu Ile 450 455 460 Ala Val Ser Phe Trp Lys Thr Pro Val Glu Cys Gly Ile Gly Phe Thr 465 470 475 480 Ile Ile Leu Ser Gly Leu Pro Val Tyr Phe Phe Gly Val Trp Trp Lys 485 490 495 Asn Lys Pro Lys Trp Leu Leu Gln Gly Ile Phe Ser Thr Thr Val Leu 500 505 510 Cys Gln Lys Leu Met Gln Val Val Pro Gln Glu Thr 515 520 38 507 PRT Homo sapiens 38 Met Ala Gly Ala Gly Pro Lys Arg Arg Ala Leu Ala Ala Pro Val Ala 1 5 10 15 Glu Glu Lys Glu Glu Ala Arg Glu Lys Met Leu Ala Ser Lys Arg Ala 20 25 30 Asp Gly Ala Ala Pro Ala Gly Glu Gly Glu Gly Val Thr Leu Gln Arg 35 40 45 Asn Ile Thr Leu Leu Asn Gly Val Ala Ile Ile Val Gly Ala Ile Ile 50 55 60 Gly Ser Gly Ile Phe Val Thr Pro Thr Gly Val Leu Lys Glu Ala Gly 65 70 75 80 Ser Pro Gly Leu Ala Leu Val Met Trp Ala Ala Cys Gly Val Phe Ser 85 90 95 Ile Val Gly Ala Leu Cys Tyr Ala Glu Leu Gly Thr Thr Ile Ser Lys 100 105 110 Ser Gly Gly Asp Tyr Ala Tyr Met Leu Glu Val Tyr Gly Ser Leu Pro 115 120 125 Ala Phe Leu Lys Leu Trp Ile Glu Leu Leu Ile Ile Arg Pro Ser Ser 130 135 140 Gln Tyr Ile Val Ala Leu Val Phe Ala Ala Tyr Leu Leu Lys Pro Leu 145 150 155 160 Phe Pro Thr Cys Pro Val Pro Glu Glu Ala Ala Lys Leu Val Ala Cys 165 170 175 Leu Cys Val Leu Leu Leu Thr Ala Val Asn Cys Tyr Ser Val Lys Ala 180 185 190 Ala Thr Arg Val Gln Asp Ala Phe Ala Ala Ala Lys Leu Leu Ala Leu 195 200 205 Ala Leu Ile Ile Leu Leu Gly Phe Val Gln Ile Gly Lys Gly Asp Val 210 215 220 Ser Asn Leu Asp Pro Asn Phe Ser Phe Glu Gly Thr Lys Leu Asp Val 225 230 235 240 Gly Asn Ile Val Leu Ala Leu Tyr Ser Gly Leu Phe Ala Tyr Gly Gly 245 250 255 Trp Asn Tyr Leu Asn Phe Val Thr Glu Glu Met Ile Asn Pro Tyr Arg 260 265 270 Asn Leu Pro Leu Ala Ile Ile Ile Ser Leu Pro 
Ile Val Thr Leu Val 275 280 285 Tyr Val Leu Thr Asn Leu Ala Tyr Phe Thr Thr Leu Ser Thr Glu Gln 290 295 300 Met Leu Ser Ser Glu Ala Val Ala Val Asp Phe Gly Asn Tyr His Leu 305 310 315 320 Gly Val Met Ser Trp Ile Ile Pro Val Phe Val Gly Leu Ser Cys Phe 325 330 335 Gly Ser Val Asn Gly Ser Leu Phe Thr Ser Ser Arg Leu Phe Phe Val 340 345 350 Gly Ser Arg Glu Gly His Leu Pro Ser Ile Leu Ser Met Ile His Pro 355 360 365 Gln Leu Leu Thr Pro Val Pro Ser Leu Val Phe Thr Cys Val Met Thr 370 375 380 Leu Leu Tyr Ala Phe Ser Lys Asp Ile Phe Ser Val Ile Asn Phe Phe 385 390 395 400 Ser Phe Phe Asn Trp Leu Cys Val Ala Leu Ala Ile Ile Gly Met Ile 405 410 415 Trp Leu Arg His Arg Lys Pro Glu Leu Glu Arg Pro Ile Lys Val Asn 420 425 430 Leu Ala Leu Pro Val Phe Phe Ile Leu Ala Cys Leu Phe Leu Ile Ala 435 440 445 Val Ser Phe Trp Lys Thr Pro Val Glu Cys Gly Ile Gly Phe Thr Ile 450 455 460 Ile Leu Ser Gly Leu Pro Val Tyr Phe Phe Gly Val Trp Trp Lys Asn 465 470 475 480 Lys Pro Lys Trp Leu Leu Gln Gly Ile Phe Ser Thr Thr Val Leu Cys 485 490 495 Gln Lys Leu Met Gln Val Val Pro Gln Glu Thr 500 505 39 507 PRT Homo sapiens 39 Met Ala Gly Ala Gly Pro Lys Arg Arg Ala Leu Ala Ala Pro Val Ala 1 5 10 15 Glu Glu Lys Glu Glu Ala Arg Glu Lys Met Leu Ala Ser Lys Arg Ala 20 25 30 Asp Gly Ala Ala Pro Ala Gly Glu Gly Glu Gly Val Thr Leu Gln Arg 35 40 45 Asn Ile Thr Leu Leu Asn Gly Val Ala Ile Ile Val Gly Ala Ile Ile 50 55 60 Gly Ser Gly Ile Phe Val Thr Pro Thr Gly Val Leu Lys Glu Ala Gly 65 70 75 80 Ser Pro Gly Leu Ala Leu Val Met Trp Ala Ala Cys Gly Val Phe Ser 85 90 95 Ile Val Gly Ala Leu Cys Tyr Ala Glu Leu Gly Thr Thr Ile Ser Lys 100 105 110 Ser Gly Gly Asp Tyr Ala Tyr Met Leu Glu Val Tyr Gly Ser Leu Pro 115 120 125 Ala Phe Leu Lys Leu Trp Ile Glu Leu Leu Ile Ile Arg Pro Ser Ser 130 135 140 Gln Tyr Ile Val Ala Leu Val Phe Ala Ala Tyr Leu Leu Lys Pro Leu 145 150 155 160 Phe Pro Thr Cys Pro Val Pro Glu Glu Ala Ala Lys Leu Val Ala Cys 165 170 175 Leu Cys Val Leu Leu Leu Thr Ala Val Asn Cys Tyr Ser Val Lys 
Ala 180 185 190 Ala Thr Arg Val Gln Asp Ala Phe Ala Ala Ala Lys Leu Leu Ala Leu 195 200 205 Ala Leu Ile Ile Leu Leu Gly Phe Val Gln Ile Gly Lys Gly Asp Val 210 215 220 Ser Asn Leu Asp Pro Asn Phe Ser Phe Glu Gly Thr Lys Leu Asp Val 225 230 235 240 Gly Asn Ile Val Leu Ala Leu Tyr Ser Gly Leu Phe Ala Tyr Gly Gly 245 250 255 Trp Asn Tyr Leu Asn Phe Val Thr Glu Glu Met Ile Asn Pro Tyr Arg 260 265 270 Asn Leu Pro Leu Ala Ile Ile Ile Ser Leu Pro Ile Val Thr Leu Val 275 280 285 Tyr Val Leu Thr Asn Leu Ala Tyr Phe Thr Thr Leu Ser Thr Glu Gln 290 295 300 Met Leu Ser Ser Glu Ala Val Ala Val Asp Phe Gly Asn Tyr His Leu 305 310 315 320 Gly Val Met Ser Trp Ile Ile Pro Val Phe Val Gly Leu Ser Cys Phe 325 330 335 Gly Ser Val Asn Gly Ser Leu Phe Thr Ser Ser Arg Leu Phe Phe Val 340 345 350 Gly Ser Arg Glu Gly His Leu Pro Ser Ile Leu Ser Met Ile His Pro 355 360 365 Gln Leu Leu Thr Pro Val Pro Ser Leu Val Phe Thr Cys Val Met Thr 370 375 380 Leu Leu Tyr Ala Phe Ser Lys Asp Ile Phe Ser Val Ile Asn Phe Phe 385 390 395 400 Ser Phe Phe Asn Trp Leu Cys Val Ala Leu Ala Ile Ile Gly Met Ile 405 410 415 Trp Leu Arg His Arg Lys Pro Glu Leu Glu Arg Pro Ile Lys Val Asn 420 425 430 Leu Ala Leu Pro Val Phe Phe Ile Leu Ala Cys Leu Phe Leu Ile Ala 435 440 445 Val Ser Phe Trp Lys Thr Pro Val Glu Cys Gly Ile Gly Phe Thr Ile 450 455 460 Ile Leu Ser Gly Leu Pro Val Tyr Phe Phe Gly Val Trp Trp Lys Asn 465 470 475 480 Lys Pro Lys Trp Leu Leu Gln Gly Ile Phe Ser Thr Thr Val Leu Cys 485 490 495 Gln Lys Leu Met Gln Val Val Pro Gln Glu Thr 500 505 40 515 PRT Homo sapiens 40 Met Glu Ala Arg Glu Pro Gly Arg Pro Thr Pro Thr Tyr His Leu Val 1 5 10 15 Pro Asn Thr Ser Gln Ser Gln Val Glu Glu Asp Val Ser Ser Pro Pro 20 25 30 Gln Arg Ser Ser Glu Thr Met Gln Leu Lys Lys Glu Ile Ser Leu Leu 35 40 45 Asn Gly Val Ser Leu Val Val Gly Asn Met Ile Gly Ser Gly Ile Phe 50 55 60 Val Ser Pro Lys Gly Val Leu Val His Thr Ala Ser Tyr Gly Met Ser 65 70 75 80 Leu Ile Val Trp Ala Ile Gly Gly Leu Phe Ser Val Val Gly Ala Leu 85 90 95 
Cys Tyr Ala Glu Leu Gly Thr Thr Ile Thr Lys Ser Gly Ala Ser Tyr 100 105 110 Ala Tyr Ile Leu Glu Ala Phe Gly Gly Phe Ile Ala Phe Ile Arg Leu 115 120 125 Trp Val Ser Leu Leu Val Val Glu Pro Thr Gly Gln Ala Ile Ile Ala 130 135 140 Ile Thr Phe Ala Asn Tyr Ile Ile Gln Pro Ser Phe Pro Ser Cys Asp 145 150 155 160 Pro Pro Tyr Leu Ala Cys Arg Leu Leu Ala Ala Ala Cys Ile Cys Leu 165 170 175 Leu Thr Phe Val Asn Cys Ala Tyr Val Lys Trp Gly Thr Arg Val Gln 180 185 190 Asp Thr Phe Thr Tyr Ala Lys Val Val Ala Leu Ile Ala Ile Ile Val 195 200 205 Met Gly Leu Val Lys Leu Cys Gln Gly His Ser Glu His Phe Gln Asp 210 215 220 Ala Phe Glu Gly Ser Ser Trp Asp Met Gly Asn Leu Ser Leu Ala Leu 225 230 235 240 Tyr Ser Ala Leu Phe Ser Tyr Ser Gly Trp Asp Thr Leu Asn Phe Val 245 250 255 Thr Glu Glu Ile Lys Asn Pro Glu Arg Asn Leu Pro Leu Ala Ile Gly 260 265 270 Ile Ser Met Pro Ile Val Thr Leu Ile Tyr Ile Leu Thr Asn Val Ala 275 280 285 Tyr Tyr Thr Val Leu Asn Ile Ser Asp Val Leu Ser Ser Asp Ala Val 290 295 300 Ala Val Thr Phe Ala Asp Gln Thr Phe Gly Met Phe Ser Trp Thr Ile 305 310 315 320 Pro Ile Ala Val Ala Leu Ser Cys Phe Gly Gly Leu Asn Ala Ser Ile 325 330 335 Phe Ala Ser Ser Arg Leu Phe Phe Val Gly Ser Arg Glu Gly His Leu 340 345 350 Pro Asp Leu Leu Ser Met Ile His Ile Glu Arg Phe Thr Pro Ile Pro 355 360 365 Ala Leu Leu Phe Asn Cys Thr Met Ala Leu Ile Tyr Leu Ile Val Glu 370 375 380 Asp Val Phe Gln Leu Ile Asn Tyr Phe Ser Phe Ser Tyr Trp Phe Phe 385 390 395 400 Val Gly Leu Ser Val Val Gly Gln Leu Tyr Leu Arg Trp Lys Glu Pro 405 410 415 Lys Arg Pro Arg Pro Leu Lys Leu Ser Val Phe Phe Pro Ile Val Phe 420 425 430 Cys Ile Cys Ser Val Phe Leu Val Ile Val Pro Leu Phe Thr Asp Thr 435 440 445 Ile Asn Ser Leu Ile Gly Ile Gly Ile Ala Leu Ser Gly Val Pro Phe 450 455 460 Tyr Phe Met Gly Val Tyr Leu Pro Glu Ser Arg Arg Pro Leu Phe Ile 465 470 475 480 Arg Asn Val Leu Ala Ala Ile Thr Arg Gly Thr Gln Gln Leu Cys Phe 485 490 495 Cys Val Leu Thr Glu Leu Asp Val Ala Glu Glu Lys Lys Asp Glu Arg 500 505 510 Lys 
Thr Asp 515 41 511 PRT Homo sapiens 41 Met Val Asp Ser Thr Glu Tyr Glu Val Ala Ser Gln Pro Glu Val Glu 1 5 10 15 Thr Ser Pro Leu Gly Asp Gly Ala Ser Pro Gly Pro Glu Gln Val Lys 20 25 30 Leu Lys Lys Glu Ile Ser Leu Leu Asn Gly Val Cys Leu Ile Val Gly 35 40 45 Asn Met Ile Gly Ser Gly Ile Phe Val Ser Pro Lys Gly Val Leu Ile 50 55 60 Tyr Ser Ala Ser Phe Gly Leu Ser Leu Val Ile Trp Ala Val Gly Gly 65 70 75 80 Leu Phe Ser Val Phe Gly Ala Leu Cys Tyr Ala Glu Leu Gly Thr Thr 85 90 95 Ile Lys Lys Ser Gly Ala Ser Tyr Ala Tyr Ile Leu Glu Ala Phe Gly 100 105 110 Gly Phe Leu Ala Phe Ile Arg Leu Trp Thr Ser Leu Leu Ile Ile Glu 115 120 125 Pro Thr Ser Gln Ala Ile Ile Ala Ile Thr Phe Ala Asn Tyr Met Val 130 135 140 Gln Pro Leu Phe Pro Ser Cys Phe Ala Pro Tyr Ala Ala Ser Arg Leu 145 150 155 160 Leu Ala Ala Ala Cys Ile Cys Leu Leu Thr Phe Ile Asn Cys Ala Tyr 165 170 175 Val Lys Trp Gly Thr Leu Val Gln Asp Ile Phe Thr Tyr Ala Lys Val 180 185 190 Leu Ala Leu Ile Ala Val Ile Val Ala Gly Ile Val Arg Leu Gly Gln 195 200 205 Gly Ala Ser Thr His Phe Glu Asn Ser Phe Glu Gly Ser Ser Phe Ala 210 215 220 Val Gly Asp Ile Ala Leu Ala Leu Tyr Ser Ala Leu Phe Ser Tyr Ser 225 230 235 240 Gly Trp Asp Thr Leu Asn Tyr Val Thr Glu Glu Ile Lys Asn Pro Glu 245 250 255 Arg Asn Leu Pro Leu Ser Ile Gly Ile Ser Met Pro Ile Val Thr Ile 260 265 270 Ile Tyr Ile Leu Thr Asn Val Ala Tyr Tyr Thr Val Leu Asp Met Arg 275 280 285 Asp Ile Leu Ala Ser Asp Ala Val Ala Val Thr Phe Ala Asp Gln Ile 290 295 300 Phe Gly Ile Phe Asn Trp Ile Ile Pro Leu Ser Val Ala Leu Ser Cys 305 310 315 320 Phe Gly Gly Leu Asn Ala Ser Ile Val Ala Ala Ser Arg Leu Phe Phe 325 330 335 Val Gly Ser Arg Glu Gly His Leu Pro Asp Ala Ile Cys Met Ile His 340 345 350 Val Glu Arg Phe Thr Pro Val Pro Ser Leu Leu Phe Asn Gly Ile Met 355 360 365 Ala Leu Ile Tyr Leu Cys Val Glu Asp Ile Phe Gln Leu Ile Asn Tyr 370 375 380 Tyr Ser Phe Ser Tyr Trp Phe Phe Val Gly Leu Ser Ile Val Gly Gln 385 390 395 400 Leu Tyr Leu Arg Trp Lys Glu Pro Asp Arg Pro Arg Pro Leu 
Lys Leu 405 410 415 Ser Val Phe Phe Pro Ile Val Phe Cys Leu Cys Thr Ile Phe Leu Val 420 425 430 Ala Val Pro Leu Tyr Ser Asp Thr Ile Asn Ser Leu Ile Gly Ile Ala 435 440 445 Ile Ala Leu Ser Gly Leu Pro Phe Tyr Phe Leu Ile Ile Arg Val Pro 450 455 460 Glu His Lys Arg Pro Leu Tyr Leu Arg Arg Ile Val Gly Ser Ala Thr 465 470 475 480 Arg Tyr Leu Gln Val Leu Cys Met Ser Val Ala Ala Glu Met Asp Leu 485 490 495 Glu Asp Gly Gly Glu Met Pro Lys Gln Arg Asp Pro Lys Ser Asn 500 505 510 42 511 PRT Homo sapiens 42 Met Val Asp Ser Thr Glu Tyr Glu Val Ala Ser Gln Pro Glu Val Glu 1 5 10 15 Thr Ser Pro Leu Gly Asp Gly Ala Ser Pro Gly Pro Glu Gln Val Lys 20 25 30 Leu Lys Lys Glu Ile Ser Leu Leu Asn Gly Val Cys Leu Ile Val Gly 35 40 45 Asn Met Ile Gly Ser Gly Ile Phe Val Ser Pro Lys Gly Val Leu Ile 50 55 60 Tyr Ser Ala Ser Phe Gly Leu Ser Leu Val Ile Trp Ala Val Gly Gly 65 70 75 80 Leu Phe Ser Val Phe Gly Ala Leu Cys Tyr Ala Glu Leu Gly Thr Thr 85 90 95 Ile Lys Lys Ser Gly Ala Ser Tyr Ala Tyr Ile Leu Glu Ala Phe Gly 100 105 110 Gly Phe Leu Ala Phe Ile Arg Leu Trp Thr Ser Leu Leu Ile Ile Glu 115 120 125 Pro Thr Ser Gln Ala Ile Ile Ala Ile Thr Phe Ala Asn Tyr Met Val 130 135 140 Gln Pro Leu Phe Pro Ser Cys Phe Ala Pro Tyr Ala Ala Ser Arg Leu 145 150 155 160 Leu Ala Ala Ala Cys Ile Cys Leu Leu Thr Phe Ile Asn Cys Ala Tyr 165 170 175 Val Lys Trp Gly Thr Leu Val Gln Asp Ile Phe Thr Tyr Ala Lys Val 180 185 190 Leu Ala Leu Ile Ala Val Ile Val Ala Gly Ile Val Arg Leu Gly Gln 195 200 205 Gly Ala Ser Thr His Phe Glu Asn Ser Phe Glu Gly Ser Ser Phe Ala 210 215 220 Val Gly Asp Ile Ala Leu Ala Leu Tyr Ser Ala Leu Phe Ser Tyr Ser 225 230 235 240 Gly Trp Asp Thr Leu Asn Tyr Val Thr Glu Glu Ile Lys Asn Pro Glu 245 250 255 Arg Asn Leu Pro Leu Ser Ile Gly Ile Ser Met Pro Ile Val Thr Ile 260 265 270 Ile Tyr Ile Leu Thr Asn Val Ala Tyr Tyr Thr Val Leu Asp Met Arg 275 280 285 Asp Ile Leu Ala Ser Asp Ala Val Ala Val Thr Phe Ala Asp Gln Ile 290 295 300 Phe Gly Ile Phe Asn Trp Ile Ile Pro Leu Ser Val Ala Leu 
Ser Cys 305 310 315 320 Phe Gly Gly Leu Asn Ala Ser Ile Val Ala Ala Ser Arg Leu Phe Phe 325 330 335 Val Gly Ser Arg Glu Gly His Leu Pro Asp Ala Ile Cys Met Ile His 340 345 350 Val Glu Arg Phe Thr Pro Val Pro Ser Leu Leu Phe Asn Gly Ile Met 355 360 365 Ala Leu Ile Tyr Leu Cys Val Glu Asp Ile Phe Gln Leu Ile Asn Tyr 370 375 380 Tyr Ser Phe Ser Tyr Trp Phe Phe Val Gly Leu Ser Ile Val Gly Gln 385 390 395 400 Leu Tyr Leu Arg Trp Lys Glu Pro Asp Arg Pro Arg Pro Leu Lys Leu 405 410 415 Ser Val Phe Phe Pro Ile Val Phe Cys Leu Cys Thr Ile Phe Leu Val 420 425 430 Ala Val Pro Leu Tyr Ser Asp Thr Ile Asn Ser Leu Ile Gly Ile Ala 435 440 445 Ile Ala Leu Ser Gly Leu Pro Phe Tyr Phe Leu Ile Ile Arg Val Pro 450 455 460 Glu His Lys Arg Pro Leu Tyr Leu Arg Arg Ile Val Gly Ser Ala Thr 465 470 475 480 Arg Tyr Leu Gln Val Leu Cys Met Ser Val Ala Ala Glu Met Asp Leu 485 490 495 Glu Asp Gly Gly Glu Met Pro Lys Gln Arg Asp Pro Lys Ser Asn 500 505 510 43 535 PRT Homo sapiens 43 Met Glu Glu Gly Ala Arg His Arg Asn Asn Thr Glu Lys Lys His Pro 1 5 10 15 Gly Gly Gly Glu Ser Asp Ala Ser Pro Glu Ala Gly Ser Gly Gly Gly 20 25 30 Gly Val Ala Leu Lys Lys Glu Ile Gly Leu Val Ser Ala Cys Gly Ile 35 40 45 Ile Val Gly Asn Ile Ile Gly Ser Gly Ile Phe Val Ser Pro Lys Gly 50 55 60 Val Leu Glu Asn Ala Gly Ser Val Gly Leu Ala Leu Ile Val Trp Ile 65 70 75 80 Val Thr Gly Phe Ile Thr Val Val Gly Ala Leu Cys Tyr Ala Glu Leu 85 90 95 Gly Val Thr Ile Pro Lys Ser Gly Gly Asp Tyr Ser Tyr Val Lys Asp 100 105 110 Ile Phe Gly Gly Leu Ala Gly Phe Leu Arg Leu Trp Ile Ala Val Leu 115 120 125 Val Ile Tyr Pro Thr Asn Gln Ala Val Ile Ala Leu Thr Phe Ser Asn 130 135 140 Tyr Val Leu Gln Pro Leu Phe Pro Thr Cys Phe Pro Pro Glu Ser Gly 145 150 155 160 Leu Arg Leu Leu Ala Ala Ile Cys Leu Leu Leu Leu Thr Trp Val Asn 165 170 175 Cys Ser Ser Val Arg Trp Ala Thr Arg Val Gln Asp Ile Phe Thr Ala 180 185 190 Gly Lys Leu Leu Ala Leu Ala Leu Ile Ile Ile Met Gly Ile Val Gln 195 200 205 Ile Cys Lys Gly Glu Tyr Phe Trp Leu Glu Pro Lys Asn 
Ala Phe Glu 210 215 220 Asn Phe Gln Glu Pro Asp Ile Gly Leu Val Ala Leu Ala Phe Leu Gln 225 230 235 240 Gly Ser Phe Ala Tyr Gly Gly Trp Asn Phe Leu Asn Tyr Val Thr Glu 245 250 255 Glu Leu Val Asp Pro Tyr Lys Asn Leu Pro Arg Ala Ile Phe Ile Ser 260 265 270 Ile Pro Leu Val Thr Phe Val Tyr Val Phe Ala Asn Val Ala Tyr Val 275 280 285 Thr Ala Met Ser Pro Gln Glu Leu Leu Ala Ser Asn Ala Val Ala Val 290 295 300 Thr Phe Gly Glu Lys Leu Leu Gly Val Met Ala Trp Ile Met Pro Ile 305 310 315 320 Ser Val Ala Leu Ser Thr Phe Gly Gly Val Asn Gly Ser Leu Phe Thr 325 330 335 Ser Ser Arg Leu Phe Phe Ala Gly Ala Arg Glu Gly His Leu Pro Ser 340 345 350 Val Leu Ala Met Ile His Val Lys Arg Cys Thr Pro Ile Pro Ala Leu 355 360 365 Leu Phe Thr Cys Ile Ser Thr Leu Leu Met Leu Val Thr Ser Asp Met 370 375 380 Tyr Thr Leu Ile Asn Tyr Val Gly Phe Ile Asn Tyr Leu Phe Tyr Gly 385 390 395 400 Val Thr Val Ala Gly Gln Ile Val Leu Arg Trp Lys Lys Pro Asp Ile 405 410 415 Pro Arg Pro Ile Lys Ile Asn Leu Leu Phe Pro Ile Ile Tyr Leu Leu 420 425 430 Phe Trp Ala Phe Leu Leu Val Phe Ser Leu Trp Ser Glu Pro Val Val 435 440 445 Cys Gly Ile Gly Leu Ala Ile Met Leu Thr Gly Val Pro Val Tyr Phe 450 455 460 Leu Gly Val Tyr Trp Gln His Lys Pro Lys Cys Phe Ser Asp Phe Ile 465 470 475 480 Glu Leu Leu Thr Leu Val Ser Gln Lys Met Cys Val Val Val Tyr Pro 485 490 495 Glu Val Glu Arg Gly Ser Gly Thr Glu Glu Ala Asn Glu Asp Met Glu 500 505 510 Glu Gln Gln Gln Pro Met Tyr Gln Pro Thr Pro Thr Lys Asp Lys Asp 515 520 525 Val Ala Gly Gln Pro Gln Pro 530 535 44 535 PRT Homo sapiens 44 Met Glu Glu Gly Ala Arg His Arg Asn Asn Thr Glu Lys Lys His Pro 1 5 10 15 Gly Gly Gly Glu Ser Asp Ala Ser Pro Glu Ala Gly Ser Gly Gly Gly 20 25 30 Gly Val Ala Leu Lys Lys Glu Ile Gly Leu Val Ser Ala Cys Gly Ile 35 40 45 Ile Val Gly Asn Ile Ile Gly Ser Gly Ile Phe Val Ser Pro Lys Gly 50 55 60 Val Leu Glu Asn Ala Gly Ser Val Gly Leu Ala Leu Ile Val Trp Ile 65 70 75 80 Val Thr Gly Phe Ile Thr Val Val Gly Ala Leu Cys Tyr Ala Glu Leu 85 90 95 Gly Val 
Thr Ile Pro Lys Ser Gly Gly Asp Tyr Ser Tyr Val Lys Asp 100 105 110 Ile Phe Gly Gly Leu Ala Gly Phe Leu Arg Leu Trp Ile Ala Val Leu 115 120 125 Val Ile Tyr Pro Thr Asn Gln Ala Val Ile Ala Leu Thr Phe Ser Asn 130 135 140 Tyr Val Leu Gln Pro Leu Phe Pro Thr Cys Phe Pro Pro Glu Ser Gly 145 150 155 160 Leu Arg Leu Leu Ala Ala Ile Cys Leu Leu Leu Leu Thr Trp Val Asn 165 170 175 Cys Ser Ser Val Arg Trp Ala Thr Arg Val Gln Asp Ile Phe Thr Ala 180 185 190 Gly Lys Leu Leu Ala Leu Ala Leu Ile Ile Ile Met Gly Ile Val Gln 195 200 205 Ile Cys Lys Gly Glu Tyr Phe Trp Leu Glu Pro Lys Asn Ala Phe Glu 210 215 220 Asn Phe Gln Glu Pro Asp Ile Gly Leu Val Ala Leu Ala Phe Leu Gln 225 230 235 240 Gly Ser Phe Ala Tyr Gly Gly Trp Asn Phe Leu Asn Tyr Val Thr Glu 245 250 255 Glu Leu Val Asp Pro Tyr Lys Asn Leu Pro Arg Ala Ile Phe Ile Ser 260 265 270 Ile Pro Leu Val Thr Phe Val Tyr Val Phe Ala Asn Val Ala Tyr Val 275 280 285 Thr Ala Met Ser Pro Gln Glu Leu Leu Ala Ser Asn Ala Val Ala Val 290 295 300 Thr Phe Gly Glu Lys Leu Leu Gly Val Met Ala Trp Ile Met Pro Ile 305 310 315 320 Ser Val Ala Leu Ser Thr Phe Gly Gly Val Asn Gly Ser Leu Phe Thr 325 330 335 Ser Ser Arg Leu Phe Phe Ala Gly Ala Arg Glu Gly His Leu Pro Ser 340 345 350 Val Leu Ala Met Ile His Val Lys Arg Cys Thr Pro Ile Pro Ala Leu 355 360 365 Leu Phe Thr Cys Ile Ser Thr Leu Leu Met Leu Val Thr Ser Asp Met 370 375 380 Tyr Thr Leu Ile Asn Tyr Val Gly Phe Ile Asn Tyr Leu Phe Tyr Gly 385 390 395 400 Val Thr Val Ala Gly Gln Ile Val Leu Arg Trp Lys Lys Pro Asp Ile 405 410 415 Pro Arg Pro Ile Lys Ile Asn Leu Leu Phe Pro Ile Ile Tyr Leu Leu 420 425 430 Phe Trp Ala Phe Leu Leu Val Phe Ser Leu Trp Ser Glu Pro Val Val 435 440 445 Cys Gly Ile Gly Leu Ala Ile Met Leu Thr Gly Val Pro Val Tyr Phe 450 455 460 Leu Gly Val Tyr Trp Gln His Lys Pro Lys Cys Phe Ser Asp Phe Ile 465 470 475 480 Glu Leu Leu Thr Leu Val Ser Gln Lys Met Cys Val Val Val Tyr Pro 485 490 495 Glu Val Glu Arg Gly Ser Gly Thr Glu Glu Ala Asn Glu Asp Met Glu 500 505 510 Glu Gln Gln 
Gln Pro Met Tyr Gln Pro Thr Pro Thr Lys Asp Lys Asp 515 520 525 Val Ala Gly Gln Pro Gln Pro 530 535 45 487 PRT Homo sapiens 45 Met Gly Asp Thr Gly Leu Arg Lys Arg Arg Glu Asp Glu Lys Ser Ile 1 5 10 15 Gln Ser Gln Glu Pro Lys Thr Thr Ser Leu Gln Lys Glu Leu Gly Leu 20 25 30 Ile Ser Gly Ile Ser Ile Ile Val Gly Thr Ile Ile Gly Ser Gly Ile 35 40 45 Phe Val Ser Ser Lys Ser Val Leu Ser Asn Thr Glu Ala Val Gly Pro 50 55 60 Cys Leu Ile Ile Trp Ala Ala Cys Gly Val Leu Ala Thr Leu Gly Ala 65 70 75 80 Leu Cys Phe Ala Glu Leu Gly Thr Met Ile Thr Lys Ser Gly Gly Glu 85 90 95 Tyr Pro Tyr Leu Met Glu Ala Tyr Gly Pro Ile Pro Ala Tyr Leu Phe 100 105 110 Ser Trp Ala Ser Leu Ile Val Ile Lys Pro Thr Ser Phe Ala Ile Ile 115 120 125 Cys Leu Ser Phe Ser Glu Tyr Val Cys Ala Pro Phe Tyr Val Gly Cys 130 135 140 Lys Pro Pro Gln Ile Val Val Lys Cys Leu Ala Ala Ala Ala Ile Leu 145 150 155 160 Phe Ile Ser Thr Val Asn Ser Leu Ser Val Arg Leu Gly Ser Tyr Val 165 170 175 Gln Asn Ile Phe Thr Ala Ala Lys Leu Val Ile Val Ala Ile Ile Ile 180 185 190 Ile Ser Gly Leu Val Leu Leu Ala Gln Gly Asn Thr Lys Asn Phe Asp 195 200 205 Asn Ser Phe Glu Gly Ala Gln Leu Ser Val Gly Ala Ile Ser Leu Ala 210 215 220 Phe Tyr Asn Gly Leu Trp Ala Tyr Asp Gly Trp Asn Gln Leu Asn Tyr 225 230 235 240 Ile Thr Glu Glu Leu Arg Asn Pro Tyr Arg Asn Leu Pro Leu Ala Ile 245 250 255 Ile Ile Gly Ile Pro Leu Val Thr Ala Cys Tyr Ile Leu Met Asn Val 260 265 270 Ser Tyr Phe Thr Val Met Thr Ala Thr Glu Leu Leu Gln Ser Gln Ala 275 280 285 Val Ala Val Thr Phe Gly Asp Arg Val Leu Tyr Pro Ala Ser Trp Ile 290 295 300 Val Pro Leu Phe Val Ala Phe Ser Thr Ile Gly Ala Ala Asn Gly Thr 305 310 315 320 Cys Phe Thr Ala Gly Arg Leu Ile Tyr Val Ala Gly Arg Glu Gly His 325 330 335 Met Leu Lys Val Leu Ser Tyr Ile Ser Val Arg Arg Leu Thr Pro Ala 340 345 350 Pro Ala Ile Ile Phe Tyr Gly Ile Ile Ala Thr Ile Tyr Ile Ile Pro 355 360 365 Gly Asp Ile Asn Ser Leu Val Asn Tyr Phe Ser Phe Ala Ala Trp Leu 370 375 380 Phe Tyr Gly Leu Thr Ile Leu Gly Leu Ile Val Met 
Arg Phe Thr Arg 385 390 395 400 Lys Glu Leu Glu Arg Pro Ile Lys Val Pro Val Val Ile Pro Val Leu 405 410 415 Met Thr Leu Ile Ser Val Phe Leu Val Leu Ala Pro Ile Ile Ser Lys 420 425 430 Pro Thr Trp Glu Tyr Leu Tyr Cys Val Leu Phe Ile Leu Ser Gly Leu 435 440 445 Leu Phe Tyr Phe Leu Phe Val His Tyr Lys Phe Gly Trp Ala Gln Lys 450 455 460 Ile Ser Lys Pro Ile Thr Met His Leu Gln Met Leu Met Glu Val Val 465 470 475 480 Pro Pro Glu Glu Asp Pro Glu 485 46 487 PRT Homo sapiens 46 Met Gly Asp Thr Gly Leu Arg Lys Arg Arg Glu Asp Glu Lys Ser Ile 1 5 10 15 Gln Ser Gln Glu Pro Lys Thr Thr Ser Leu Gln Lys Glu Leu Gly Leu 20 25 30 Ile Ser Gly Ile Ser Ile Ile Val Gly Thr Ile Ile Gly Ser Gly Ile 35 40 45 Phe Val Ser Pro Lys Ser Val Leu Ser Asn Thr Glu Ala Val Gly Pro 50 55 60 Cys Leu Ile Ile Trp Ala Ala Cys Gly Val Leu Ala Thr Leu Gly Ala 65 70 75 80 Leu Cys Phe Ala Glu Leu Gly Thr Met Ile Thr Lys Ser Gly Gly Glu 85 90 95 Tyr Pro Tyr Leu Met Glu Ala Tyr Gly Pro Ile Pro Ala Tyr Leu Phe 100 105 110 Ser Trp Ala Ser Leu Ile Val Ile Lys Pro Thr Ser Phe Ala Ile Ile 115 120 125 Cys Leu Ser Phe Ser Glu Tyr Val Cys Ala Pro Phe Tyr Val Gly Cys 130 135 140 Lys Pro Pro Gln Ile Val Val Lys Cys Leu Ala Ala Ala Ala Ile Leu 145 150 155 160 Phe Ile Ser Thr Val Asn Ser Leu Ser Val Arg Leu Gly Ser Tyr Val 165 170 175 Gln Asn Ile Phe Thr Ala Ala Lys Leu Val Ile Val Ala Ile Ile Ile 180 185 190 Ile Ser Gly Leu Val Leu Leu Ala Gln Gly Asn Thr Lys Asn Phe Asp 195 200 205 Asn Ser Phe Glu Gly Ala Gln Leu Ser Val Gly Ala Ile Ser Leu Ala 210 215 220 Phe Tyr Asn Gly Leu Trp Ala Tyr Asp Gly Trp Asn Gln Leu Asn Tyr 225 230 235 240 Ile Thr Glu Glu Leu Arg Asn Pro Tyr Arg Asn Leu Pro Leu Ala Ile 245 250 255 Ile Ile Gly Ile Pro Leu Val Thr Ala Cys Tyr Ile Leu Met Asn Val 260 265 270 Ser Tyr Phe Thr Val Met Thr Ala Thr Glu Leu Leu Gln Ser Gln Ala 275 280 285 Val Ala Val Thr Phe Gly Asp Arg Val Leu Tyr Pro Ala Ser Trp Ile 290 295 300 Val Pro Leu Phe Val Ala Phe Ser Thr Ile Gly Ala Ala Asn Gly Thr 305 310 315 320 Cys 
Phe Thr Ala Gly Arg Leu Ile Tyr Val Ala Gly Arg Glu Gly His 325 330 335 Met Leu Lys Val Leu Ser Tyr Ile Ser Val Arg Arg Leu Thr Pro Ala 340 345 350 Pro Ala Ile Ile Phe Tyr Gly Ile Ile Ala Thr Ile Tyr Ile Ile Pro 355 360 365 Gly Asp Ile Asn Ser Leu Val Asn Tyr Phe Ser Phe Ala Ala Trp Leu 370 375 380 Phe Tyr Gly Leu Thr Ile Leu Gly Leu Ile Val Met Arg Phe Thr Arg 385 390 395 400 Lys Glu Leu Glu Arg Pro Ile Lys Val Pro Val Val Ile Pro Val Leu 405 410 415 Met Thr Leu Ile Ser Val Phe Leu Val Leu Ala Pro Ile Ile Ser Lys 420 425 430 Pro Thr Trp Glu Tyr Leu Tyr Cys Val Leu Phe Ile Leu Ser Gly Leu 435 440 445 Leu Phe Tyr Phe Leu Phe Val His Tyr Lys Phe Gly Trp Ala Gln Lys 450 455 460 Ile Ser Lys Pro Ile Thr Met His Leu Gln Met Leu Met Glu Val Val 465 470 475 480 Pro Pro Glu Glu Asp Pro Glu 485 47 523 PRT Homo sapiens 47 Met Ala Gly His Thr Gln Gln Pro Ser Gly Arg Gly Asn Pro Arg Pro 1 5 10 15 Ala Pro Ser Pro Ser Pro Val Pro Gly Thr Val Pro Gly Ala Ser Glu 20 25 30 Arg Val Ala Leu Lys Lys Glu Ile Gly Leu Leu Ser Ala Cys Thr Ile 35 40 45 Ile Ile Gly Asn Ile Ile Gly Ser Gly Ile Phe Ile Ser Pro Lys Gly 50 55 60 Val Leu Glu His Ser Gly Ser Val Gly Leu Ala Leu Phe Val Trp Val 65 70 75 80 Leu Gly Gly Gly Val Thr Ala Leu Gly Ser Leu Cys Tyr Ala Glu Leu 85 90 95 Gly Val Ala Ile Pro Lys Ser Gly Gly Asp Tyr Ala Tyr Val Thr Glu 100 105 110 Ile Phe Gly Gly Leu Ala Gly Phe Leu Leu Leu Trp Ser Ala Val Leu 115 120 125 Ile Met Tyr Pro Thr Ser Leu Ala Val Ile Ser Met Thr Phe Ser Asn 130 135 140 Tyr Val Leu Gln Pro Val Phe Pro Asn Cys Ile Pro Pro Thr Thr Ala 145 150 155 160 Ser Arg Val Leu Ser Met Ala Cys Leu Met Leu Leu Thr Trp Val Asn 165 170 175 Ser Ser Ser Val Arg Trp Ala Thr Arg Ile Gln Asp Met Phe Thr Gly 180 185 190 Gly Lys Leu Leu Ala Leu Ser Leu Ile Ile Gly Val Gly Leu Leu Gln 195 200 205 Ile Phe Gln Gly His Phe Glu Glu Leu Arg Pro Ser Asn Ala Phe Ala 210 215 220 Phe Trp Met Thr Pro Ser Val Gly His Leu Ala Leu Ala Phe Leu Gln 225 230 235 240 Gly Ser Phe Ala Phe Ser Gly Trp Asn Phe 
Leu Asn Tyr Val Thr Glu 245 250 255 Glu Met Val Asp Ala Arg Lys Asn Leu Pro Arg Ala Ile Phe Ile Ser 260 265 270 Ile Pro Leu Val Thr Phe Val Tyr Thr Phe Thr Asn Ile Ala Tyr Phe 275 280 285 Thr Ala Met Ser Pro Gln Glu Leu Leu Ser Ser Asn Ala Val Ala Val 290 295 300 Thr Phe Gly Glu Lys Leu Leu Gly Tyr Phe Ser Trp Val Met Pro Val 305 310 315 320 Ser Val Ala Leu Ser Thr Phe Gly Gly Ile Asn Gly Tyr Leu Phe Thr 325 330 335 Tyr Ser Arg Leu Cys Phe Ser Gly Ala Arg Glu Gly His Leu Pro Ser 340 345 350 Leu Leu Ala Met Ile His Val Arg His Cys Thr Pro Ile Pro Ala Leu 355 360 365 Leu Val Cys Cys Gly Ala Thr Ala Val Ile Met Leu Val Gly Asp Thr 370 375 380 Tyr Thr Leu Ile Asn Tyr Val Ser Phe Ile Asn Tyr Leu Cys Tyr Gly 385 390 395 400 Val Thr Ile Leu Gly Leu Leu Leu Leu Arg Trp Arg Arg Pro Ala Leu 405 410 415 His Arg Pro Ile Lys Val Asn Leu Leu Ile Pro Val Ala Tyr Leu Val 420 425 430 Phe Trp Ala Phe Leu Leu Val Phe Ser Phe Ile Ser Glu Pro Met Val 435 440 445 Cys Gly Val Gly Val Ile Ile Ile Leu Thr Gly Val Pro Ile Phe Phe 450 455 460 Leu Gly Val Phe Trp Arg Ser Lys Pro Lys Cys Val His Arg Leu Thr 465 470 475 480 Glu Ser Met Thr His Trp Gly Gln Glu Leu Cys Phe Val Val Tyr Pro 485 490 495 Gln Asp Ala Pro Glu Glu Glu Glu Asn Gly Pro Cys Pro Pro Ser Leu 500 505 510 Leu Pro Ala Thr Asp Lys Pro Ser Lys Pro Gln 515 520 48 501 PRT Homo sapiens 48 Met Val Arg Lys Pro Val Val Ser Thr Ile Ser Lys Gly Gly Tyr Leu 1 5 10 15 Gln Gly Asn Val Asn Gly Arg Leu Pro Ser Leu Gly Asn Lys Glu Pro 20 25 30 Pro Gly Gln Glu Lys Val Gln Leu Lys Arg Lys Val Thr Leu Leu Arg 35 40 45 Gly Val Ser Ile Ile Ile Gly Thr Ile Ile Gly Ala Gly Ile Phe Ile 50 55 60 Ser Pro Lys Gly Val Leu Gln Asn Thr Gly Ser Val Gly Met Ser Leu 65 70 75 80 Thr Ile Trp Thr Val Cys Gly Val Leu Ser Leu Phe Gly Ala Leu Ser 85 90 95 Tyr Ala Glu Leu Gly Thr Thr Ile Lys Lys Ser Gly Gly His Tyr Thr 100 105 110 Tyr Ile Leu Glu Val Phe Gly Pro Leu Pro Ala Phe Val Arg Val Trp 115 120 125 Val Glu Leu Leu Ile Ile Arg Pro Ala Ala Thr Ala Val Ile Ser 
Leu 130 135 140 Ala Phe Gly Arg Tyr Ile Leu Glu Pro Phe Phe Ile Gln Cys Glu Ile 145 150 155 160 Pro Glu Leu Ala Ile Lys Leu Ile Thr Ala Val Gly Ile Thr Val Val 165 170 175 Met Val Leu Asn Ser Met Ser Val Ser Trp Ser Ala Arg Ile Gln Ile 180 185 190 Phe Leu Thr Phe Cys Lys Leu Thr Ala Ile Leu Ile Ile Ile Val Pro 195 200 205 Gly Val Met Gln Leu Ile Lys Gly Gln Thr Gln Asn Phe Lys Asp Ala 210 215 220 Phe Ser Gly Arg Asp Ser Ser Ile Thr Arg Leu Pro Leu Ala Phe Tyr 225 230 235 240 Tyr Gly Met Tyr Ala Tyr Ala Gly Trp Phe Tyr Leu Asn Phe Val Thr 245 250 255 Glu Glu Val Glu Asn Pro Glu Lys Thr Ile Pro Leu Ala Ile Cys Ile 260 265 270 Ser Met Ala Ile Val Thr Ile Gly Tyr Val Leu Thr Asn Val Ala Tyr 275 280 285 Phe Thr Thr Ile Asn Ala Glu Glu Leu Leu Leu Ser Asn Ala Val Ala 290 295 300 Val Thr Phe Ser Glu Arg Leu Leu Gly Asn Phe Ser Leu Ala Val Pro 305 310 315 320 Ile Phe Val Ala Leu Ser Cys Phe Gly Ser Met Asn Gly Gly Val Phe 325 330 335 Ala Val Ser Arg Leu Phe Tyr Val Ala Ser Arg Glu Gly His Leu Pro 340 345 350 Glu Ile Leu Ser Met Ile His Val Arg Lys His Thr Pro Leu Pro Ala 355 360 365 Val Ile Val Leu His Pro Leu Thr Met Ile Met Leu Phe Ser Gly Asp 370 375 380 Leu Asp Ser Leu Leu Asn Phe Leu Ser Phe Ala Arg Trp Leu Phe Ile 385 390 395 400 Gly Leu Ala Val Ala Gly Leu Ile Tyr Leu Arg Tyr Lys Cys Pro Asp 405 410 415 Met His Arg Pro Phe Lys Val Pro Leu Phe Ile Pro Ala Leu Phe Ser 420 425 430 Phe Thr Cys Leu Phe Met Val Ala Leu Ser Leu Tyr Ser Asp Pro Phe 435 440 445 Ser Thr Gly Ile Gly Phe Val Ile Thr Leu Thr Gly Val Pro Ala Tyr 450 455 460 Tyr Leu Phe Ile Ile Trp Asp Lys Lys Pro Arg Trp Phe Arg Ile Met 465 470 475 480 Ser Glu Lys Ile Thr Arg Thr Leu Gln Ile Ile Leu Glu Val Val Pro 485 490 495 Glu Glu Asp Lys Leu 500 49 501 PRT Homo sapiens 49 Met Val Arg Lys Pro Val Val Ser Thr Ile Ser Lys Gly Gly Tyr Leu 1 5 10 15 Gln Gly Asn Val Asn Gly Arg Leu Pro Ser Leu Gly Asn Lys Glu Pro 20 25 30 Pro Gly Gln Glu Lys Val Gln Leu Lys Arg Lys Val Thr Leu Leu Arg 35 40 45 Gly Val Ser Ile Ile 
Ile Gly Thr Ile Ile Gly Ala Gly Ile Phe Ile 50 55 60 Ser Pro Lys Gly Val Leu Gln Asn Thr Gly Ser Val Gly Met Ser Leu 65 70 75 80 Thr Ile Trp Thr Val Cys Gly Val Leu Ser Leu Phe Gly Ala Leu Ser 85 90 95 Tyr Ala Glu Leu Gly Thr Thr Ile Lys Lys Ser Gly Gly His Tyr Thr 100 105 110 Tyr Ile Leu Glu Val Phe Gly Pro Leu Pro Ala Phe Val Arg Val Trp 115 120 125 Val Glu Leu Leu Ile Ile Arg Pro Ala Ala Thr Ala Val Ile Ser Leu 130 135 140 Ala Phe Gly Arg Tyr Ile Leu Glu Pro Phe Phe Ile Gln Cys Glu Ile 145 150 155 160 Pro Glu Leu Ala Ile Lys Leu Ile Thr Ala Val Gly Ile Thr Val Val 165 170 175 Met Val Leu Asn Ser Met Ser Val Ser Trp Ser Ala Arg Ile Gln Ile 180 185 190 Phe Leu Thr Phe Cys Lys Leu Thr Ala Ile Leu Ile Ile Ile Val Pro 195 200 205 Gly Val Met Gln Leu Ile Lys Gly Gln Thr Gln Asn Phe Lys Asp Ala 210 215 220 Phe Ser Gly Arg Asp Ser Ser Ile Thr Arg Leu Pro Leu Ala Phe Tyr 225 230 235 240 Tyr Gly Met Tyr Ala Tyr Ala Gly Trp Phe Tyr Leu Asn Phe Val Thr 245 250 255 Glu Glu Val Glu Asn Pro Glu Lys Thr Ile Pro Leu Ala Ile Cys Ile 260 265 270 Ser Met Ala Ile Val Thr Ile Gly Tyr Val Leu Thr Asn Val Ala Tyr 275 280 285 Phe Thr Thr Ile Asn Ala Glu Glu Leu Leu Leu Ser Asn Ala Val Ala 290 295 300 Val Thr Phe Ser Glu Arg Leu Leu Gly Asn Phe Ser Leu Ala Val Pro 305 310 315 320 Ile Phe Val Ala Leu Ser Cys Phe Gly Ser Met Asn Gly Gly Val Phe 325 330 335 Ala Val Ser Arg Leu Phe Tyr Val Ala Ser Arg Glu Gly His Leu Pro 340 345 350 Glu Ile Leu Ser Met Ile His Val Arg Lys His Thr Pro Leu Pro Ala 355 360 365 Val Ile Val Leu His Pro Leu Thr Met Ile Met Leu Phe Ser Gly Asp 370 375 380 Leu Asp Ser Leu Leu Asn Phe Leu Ser Phe Ala Arg Trp Leu Phe Ile 385 390 395 400 Gly Leu Ala Val Ala Gly Leu Ile Tyr Leu Arg Tyr Lys Cys Pro Asp 405 410 415 Met His Arg Pro Phe Lys Val Pro Leu Phe Ile Pro Ala Leu Phe Ser 420 425 430 Phe Thr Cys Leu Phe Met Val Ala Leu Ser Leu Tyr Ser Asp Pro Phe 435 440 445 Ser Thr Gly Ile Gly Phe Val Ile Thr Leu Thr Gly Val Pro Ala Tyr 450 455 460 Tyr Leu Phe Ile Ile Trp Asp Lys 
Lys Pro Arg Trp Phe Arg Ile Met 465 470 475 480 Ser Glu Lys Ile Thr Arg Thr Leu Gln Ile Ile Leu Glu Val Val Pro 485 490 495 Glu Glu Asp Lys Leu 500 50 180 PRT Homo sapiens 50 Met Ala Gly Ala Gly Pro Lys Arg Arg Ala Leu Ala Ala Pro Val Ala 1 5 10 15 Glu Glu Lys Glu Glu Ala Arg Glu Lys Ile Met Ala Ala Lys Arg Ala 20 25 30 Asp Gly Ala Ala Pro Ala Gly Glu Gly Glu Gly Val Thr Leu Gln Gly 35 40 45 Asn Ile Thr Leu Leu Lys Gly Val Ala Val Ile Val Val Ala Ile Met 50 55 60 Gly Ser Gly Ile Phe Val Thr Pro Thr Gly Val Leu Lys Glu Ala Gly 65 70 75 80 Ser Pro Gly Leu Ala Leu Val Val Trp Ala Ala Cys Gly Val Phe Ser 85 90 95 Ile Val Gly Ala Leu Cys Tyr Ala Glu Leu Gly Thr Thr Ile Ser Lys 100 105 110 Ser Gly Gly Asp Tyr Ala Tyr Met Leu Asp Val Tyr Gly Ser Leu Pro 115 120 125 Ala Phe Leu Lys Leu Trp Ile Glu Leu Leu Ile Ile Arg Pro Ser Ser 130 135 140 Gln Tyr Ile Val Ala Leu Val Phe Ala Thr Tyr Leu Leu Lys Pro Leu 145 150 155 160 Phe Pro Thr Cys Pro Val Pro Glu Glu Ala Ala Lys Leu Val Ala Cys 165 170 175 His Ser Val His 180 51 1541 DNA Homo sapiens 51 cggccggtgc gcagagcatg gcgggtgcgg gcccgaagcg gcgcgcgcta gcggcgccgg 60 cggccgagga gaaggaagag gcgcgggaga agatgctggc cgccaagagc gcggacggct 120 cggcgccggc aggcgagggc gagggcgtga ccctgcagcg gaacatcacg ctgctcaacg 180 gcgtggccat catcgtgggg accattatcg gctcgggcat cttcgtgacg cccacgggcg 240 tgctcaagga ggcaggctcg ccggggctgg cgctggtggt gtgggccgcg tgcggcgtct 300 tctccatcgt gggcgcgctc tgctacgcgg agctcggcac caccatctcc aaatcgggcg 360 gagactacgc ctacatgctg gaggtctacg gctcgctgcc cgccttcctc aagctctgga 420 tcgagctgct catcatccgg ccttcatcgc agtacatcgt ggccctggtc ttcgccacct 480 acctgctcaa gccgctcttc cccacctgcc cggtgcccga ggaggcagcc aagctcgtgg 540 cctgcctctg cgtgctgctg ctcacggccg tgaactgcta cagcgtgaag gccgccaccc 600 gggtccagga tgcctttgcc gccgccaagc tcctggccct ggccctgatc atcctgctgg 660 gcttcgtcca gatcgggaag ggtgatgtgt ccaatctaga tcccaacttc tcatttgaag 720 gcaccaaact ggatgtgggg aacattgtgc tggcattata cagcggcctc tttgcctatg 780 gaggatggaa ttacttgaat ttcgtcacag 
aggaaatgat caacccctac agaaacctgc 840 ccctggccat catcatctcc ctgcccatcg tgacgctggt gtacgtgctg accaacctgg 900 cctacttcac caccctgtcc accgagcaga tgctgtcgtc cgaggccgtg gccgtggact 960 tcgggaacta tcacctgggc gtcatgtcct ggatcatccc cgtcttcgtg ggcctgtcct 1020 gcttcggctc cgtcaatggg tccctgttca catcctccag gctcttcttc gtggggtccc 1080 gggaaggcca cctgccctcc atcctctcca tgatccaccc acagctcctc acccccgtgc 1140 cgtccctcgt gttcacgtgt gtgatgacgc tgctctacgc cttctccaag gacatcttct 1200 ccgtcatcaa cttcttcagc ttcttcaact ggctctgcgt ggccctggcc atcatcggca 1260 tgatctggct gcgccacaga aagcctgagc ttgagcggcc catcaaggtg aacctggccc 1320 tgcctgtgtt cttcatcctg gcctgcctct tcctgatcgc cgtctccttc tggaagacac 1380 ccgtggagtg tggcatcggc ttcaccatca tcctcagcgg gctgcccgtc tacttcttcg 1440 gggtctggtg gaaaaacaag cccaagtggc tcctccaggg catcttctcc acgaccgtcc 1500 tgtgtcagaa gctcatgcag gtggtccccc aggagacata g 1541 52 1528 DNA Homo sapiens 52 caccgaattc tgtgtcccta ctatggtcag aaagcctgtt gtgtccacca tctccaaagg 60 aggttacctg cagggaaatg ttaacgggag gctgccttcc ctgggcaaca aggagccacc 120 tgggcaggag aaagtgcagc tgaagaggaa agtcacttta ctgaggggag tctccattat 180 cattggcacc atcattggag caggaatctt catctctcct aagggcgtgc tccagaacac 240 gggcagcgtg ggcatgtctc tgaccatctg gacggtgtgt ggggtcctgt cactatttgg 300 agctttgtct tatgctgaat tgggaacaac tataaagaaa tctggaggtc attacacata 360 tattttggaa gtctttggtc cattaccagc ttttgtacga gtctgggtgg aactcctcat 420 aatacgccct gcagctactg ctgtgatatc cctggcattt ggacgctaca ttctggaacc 480 attttttatt caatgtgaaa tccctgaact tgcgatcaag ctcattacag ctgtgggcat 540 aactgtagtg atggtcctaa atagcatgag tgtcagctgg agcgcccgga tccagatttt 600 cttaaccttt tgcaagctca cagcaattct gataattata gtccctggag ttatgcagct 660 aattaaaggt caaacgcaga actttaaaga cgccttttca ggaagagatt caagtattac 720 gcggttgcca ctggcttttt attatggaat gtatgcatat gctggctggt tttacctcaa 780 ctttgttact gaagaagtag aaaaccctga aaaaaccatt ccccttgcaa tatgtatatc 840 catggccatt gtcaccattg gctatgtgct gacaaatgtg gcctacttta cgaccattaa 900 tgctgaggag ctgctgcttt caaatgcagt ggcagtgacc ttttctgagc 
ggctactggg 960 aaatttctca ttagcagttc cgatctttgt tgccctctcc tgctttggct ccatgaacgg 1020 tggtgtgttt gctgtctcca ggttattcta tgttgcgtct cgagagggtc accttccaga 1080 aatcctctcc atgattcatg tccgcaagca cactcctcta ccagctgtta ttgttttgca 1140 ccctttgaca atgataatgc tcttctctgg agacctcgac agtcttttga atttcctcag 1200 ttttgccagg tggcttttta ttgggctggc agttgctggg ctgatttatc ttcgatacaa 1260 atgcccagat atgcatcgtc ctttcaaggt gcgactgttc atcccagctt tgttttcctt 1320 cacatgcctc ttcatggttg ccctttccct ctattcggac ccatttagta cagggattgg 1380 cttcgtcatc actctgactg gagtccctgc gtattatctc tttattatat gggacaagaa 1440 acccaggtgg tttagaataa tgtcggagaa aataaccaga acattacaaa taatactgga 1500 agttgtacca gaagaagata agttatga 1528 53 1268 DNA Homo sapiens 53 caccgaattc tgtgtcccta ctatggtcag aaagcctgtt gtgtccacca tctccaaagg 60 aggttacctg cagggaaatg ttaacgggag gctgccttcc ctgggcaaca aggagccacc 120 tgggcaggag aaagtgcagc tgaagaggaa agtcacttta ctgaggggag tctccattat 180 cattggcacc atcattggag caggaatctt catctctcct aagggcgtgc tccagaacac 240 gggcagcgtg ggcatgtctc tgaccatctg gacggtgtgt ggggtcctgt cactatttgg 300 agctttgtct tatgctgaat tgggaacaac tataaagaaa tctggaggtc attacacata 360 tattttggaa gtctttggtc cattaccagc ttttgtacga gtctgggtgg aactcctcat 420 aatacgccct gcagctactg ctgtgatatc cctggcattt ggacgctaca ttctggaacc 480 attttttatt caatgtgaaa tccctgaact tgcgatcaag ctcattacag ctgtgggcat 540 aactgtagtg atggtcctaa atagcatgag tgtcagctgg agcgcccgga tccagatttt 600 cttaaccttt tgcaagctca cagcaattct gataattata gtccctggag ttatgcagct 660 aattaaaggt caaacgcaga actttaaaga cgccttttca ggaagagatt caagtattac 720 gcggttgcca ctggcttttt attatggaat gtatgcatat gctggctggt tttacctcaa 780 ctttgttact gaagaagtag aaaaccctga aaaaaccatt ccccttgcaa tatgtatatc 840 catggccatt gtcaccattg gctatgtgct gacaaatgtg gcctacttta cgaccattaa 900 tgctgaggag ctgctgcttt caaatgcagt ggcagtgacc ttttctgagc ggctactggg 960 aaatttctca ttagcagttc cgatctttgt tgccccctcc tctaccagct gttattgttt 1020 tgcacccttt gacaatgata atgctcttct ctggagacct cgacagtctt ttgaatttcc 1080 tcagttttgc 
caggtggctt tttattgggc tggcagttgc tgggctgatt tatcttcgat 1140 acaaatgccc agatatgcat cgtcctttca aggtgccact gttcatccca gctttgtttt 1200 ccttcacatg cctcttcatg gttgcccttt ccctctattc ggacccattt agtacaggga 1260 ttggcttc 1268 54 507 PRT Homo sapiens 54 Met Ala Gly Ala Gly Pro Lys Arg Arg Ala Leu Ala Ala Pro Ala Ala 1 5 10 15 Glu Glu Lys Glu Glu Ala Arg Glu Lys Met Leu Ala Ala Lys Ser Ala 20 25 30 Asp Gly Ser Ala Pro Ala Gly Glu Gly Glu Gly Val Thr Leu Gln Arg 35 40 45 Asn Ile Thr Leu Leu Asn Gly Val Ala Ile Ile Val Gly Thr Ile Ile 50 55 60 Gly Ser Gly Ile Phe Val Thr Pro Thr Gly Val Leu Lys Glu Ala Gly 65 70 75 80 Ser Pro Gly Leu Ala Leu Val Val Trp Ala Ala Cys Gly Val Phe Ser 85 90 95 Ile Val Gly Ala Leu Cys Tyr Ala Glu Leu Gly Thr Thr Ile Ser Lys 100 105 110 Ser Gly Gly Asp Tyr Ala Tyr Met Leu Glu Val Tyr Gly Ser Leu Pro 115 120 125 Ala Phe Leu Lys Leu Trp Ile Glu Leu Leu Ile Ile Arg Pro Ser Ser 130 135 140 Gln Tyr Ile Val Ala Leu Val Phe Ala Thr Tyr Leu Leu Lys Pro Leu 145 150 155 160 Phe Pro Thr Cys Pro Val Pro Glu Glu Ala Ala Lys Leu Val Ala Cys 165 170 175 Leu Cys Val Leu Leu Leu Thr Ala Val Asn Cys Tyr Ser Val Lys Ala 180 185 190 Ala Thr Arg Val Gln Asp Ala Phe Ala Ala Ala Lys Leu Leu Ala Leu 195 200 205 Ala Leu Ile Ile Leu Leu Gly Phe Val Gln Ile Gly Lys Gly Asp Val 210 215 220 Ser Asn Leu Asp Pro Asn Phe Ser Phe Glu Gly Thr Lys Leu Asp Val 225 230 235 240 Gly Asn Ile Val Leu Ala Leu Tyr Ser Gly Leu Phe Ala Tyr Gly Gly 245 250 255 Trp Asn Tyr Leu Asn Phe Val Thr Glu Glu Met Ile Asn Pro Tyr Arg 260 265 270 Asn Leu Pro Leu Ala Ile Ile Ile Ser Leu Pro Ile Val Thr Leu Val 275 280 285 Tyr Val Leu Thr Asn Leu Ala Tyr Phe Thr Thr Leu Ser Thr Glu Gln 290 295 300 Met Leu Ser Ser Glu Ala Val Ala Val Asp Phe Gly Asn Tyr His Leu 305 310 315 320 Gly Val Met Ser Trp Ile Ile Pro Val Phe Val Gly Leu Ser Cys Phe 325 330 335 Gly Ser Val Asn Gly Ser Leu Phe Thr Ser Ser Arg Leu Phe Phe Val 340 345 350 Gly Ser Arg Glu Gly His Leu Pro Ser Ile Leu Ser Met Ile His Pro 355 360 365 Gln Leu 
Leu Thr Pro Val Pro Ser Leu Val Phe Thr Cys Val Met Thr 370 375 380 Leu Leu Tyr Ala Phe Ser Lys Asp Ile Phe Ser Val Ile Asn Phe Phe 385 390 395 400 Ser Phe Phe Asn Trp Leu Cys Val Ala Leu Ala Ile Ile Gly Met Ile 405 410 415 Trp Leu Arg His Arg Lys Pro Glu Leu Glu Arg Pro Ile Lys Val Asn 420 425 430 Leu Ala Leu Pro Val Phe Phe Ile Leu Ala Cys Leu Phe Leu Ile Ala 435 440 445 Val Ser Phe Trp Lys Thr Pro Val Glu Cys Gly Ile Gly Phe Thr Ile 450 455 460 Ile Leu Ser Gly Leu Pro Val Tyr Phe Phe Gly Val Trp Trp Lys Asn 465 470 475 480 Lys Pro Lys Trp Leu Leu Gln Gly Ile Phe Ser Thr Thr Val Leu Cys 485 490 495 Gln Lys Leu Met Gln Val Val Pro Gln Glu Thr 500 505 

What is claimed is:
 1. A method of identifying an SLC7 modulating agent, said method comprising the steps of: (a) providing an assay system comprising a purified SLC7 polypeptide or nucleic acid or a functionally active fragment or derivative thereof; (b) contacting the assay system with a test agent under conditions whereby, but for the presence of the test agent, the system provides a reference activity; and (c) detecting a test agent-biased activity of the assay system, wherein a difference between the test agent-biased activity and the reference activity identifies the test agent as an SLC7-modulating agent.
 2. The method of claim 1 wherein the SLC7 polypeptide or nucleic acid is SLC7A5.
 3. The method of claim 1 wherein the SLC7 polypeptide or nucleic acid is SLC7A11.
 4. The method of claim 1 wherein the assay system comprises cultured cells that express the SLC7 polypeptide.
 5. The method of claim 4 wherein the cultured cells additionally have defective p53 function.
 6. The method of claim 1 wherein the assay system includes a screening assay comprising a SLC7 polypeptide, and the candidate test agent is a small molecule modulator.
 7. The method of claim 6 wherein the assay is a transporter assay.
 8. The method of claim 1 wherein the assay system is selected from the group consisting of an apoptosis assay system, a cell proliferation assay system, an angiogenesis assay system, and a hypoxic induction assay system.
 9. The method of claim 1 wherein the assay system includes a binding assay comprising a SLC7 polypeptide and the candidate test agent is an antibody.
 10. The method of claim 1 wherein the assay system includes an expression assay comprising a SLC7 nucleic acid and the candidate test agent is a nucleic acid modulator.
 11. The method of claim 10 wherein the nucleic acid modulator is an antisense oligomer.
 12. The method of claim 10 wherein the nucleic acid modulator is a PMO.
 13. The method of claim 1 additionally comprising: (d) administering the SLC7-modulating agent identified in (c) to a model system comprising cells defective in p53 function, and detecting a phenotypic change in the model system that indicates that the p53 function is restored, wherein restoration of p53 function identifies the SLC7-modulating agent as a p53 modulating agent.
 14. The method of claim 13 wherein the model system is a mouse model with defective p53 function.
 15. A method for modulating SLC7 function in a mammalian cell comprising contacting the cell with an SLC7 modulating agent.
 16. The method of claim 15 wherein the SLC7 modulating agent modulates an SLC7A5 polypeptide or nucleic acid.
 17. The method of claim 15 wherein the SLC7 modulating agent modulates an SLC7A11 polypeptide or nucleic acid.
 18. The method of claim 15 wherein said cell has defective p53 function, and said SLC7 modulating agent restores p53 function.
 19. The method of claim 15 wherein the SLC7 modulating agent specifically modulates a SLC7 polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NOs: 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, and 54.
 20. The method of claim 15 wherein the SLC7-modulating agent is administered to a vertebrate animal predetermined to have a disease or disorder resulting from a defect in p53 function.
 21. The method of claim 15 wherein the SLC7-modulating agent is selected from the group consisting of an antibody and a small molecule.
 22. The method of claim 1, comprising the additional steps of: (d) providing a secondary assay system that measures changes in p53 function, wherein said secondary assay system comprises cultured cells or a non-human animal expressing SLC7, (e) contacting the secondary assay system with the test agent of (b) or an agent derived therefrom under conditions whereby, but for the presence of the test agent or agent derived therefrom, the system provides a reference activity indicative of p53 function; and (f) detecting an agent-biased activity of the secondary assay system, wherein a difference between the agent-biased activity and the reference activity of the secondary assay system identifies the test agent or agent derived therefrom as a candidate p53 pathway modulating agent.
 23. The method of claim 22 wherein the secondary assay system comprises cultured cells.
 24. The method of claim 22 wherein the secondary assay system comprises a non-human animal.
 25. The method of claim 24 wherein the non-human animal mis-expresses a p53 pathway gene.
 26. A method of modulating the p53 pathway in a mammalian cell comprising contacting the cell with an SLC7-modulating agent that modulates the p53 pathway.
 27. The method of claim 24 wherein the agent is administered to a mammalian animal predetermined to have a pathology associated with the p53 pathway.
 28. The method of claim 24 wherein the SLC7-modulating agent is selected from the group consisting of a small molecule modulator, a nucleic acid modulator, and an antibody modulator.
 29. A method for diagnosing a disease or disorder associated with alterations in SLC7 expression in a patient comprising: (a) obtaining a biological sample from the patient; (b) contacting the sample with a probe for SLC7 expression; (c) comparing results from step (b) with a control; and (d) determining whether step (c) indicates a likelihood of the disease or disorder.
 30. The method of claim 27 wherein said disease or disorder is cancer.
 31. The method according to claim 28, wherein said cancer is a cancer as shown in Table 1 as having >25% expression level.
 32. The method according to claim 27 wherein the probe is specific for SLC7A5 expression.
 33. The method according to claim 27 wherein the probe is specific for SLC7A11 expression.
 34. A method for treating a disorder associated with impaired SLC7 function that comprises administering a therapeutically effective amount of a SLC7 modulating agent, whereby SLC7 function is restored.
 35. The method of claim 32 wherein the impaired SLC7 function is attributable to an overexpression of SLC7.
 36. The method of claim 32 wherein the impaired SLC7 function is attributable to an underexpression of SLC7.
 37. The method of claim 32 wherein the impaired SLC7 function is attributable to impaired SLC7A5.
 38. The method of claim 32 wherein the impaired SLC7 function is attributable to impaired SLC7A11.
 39. A method for treating a disorder associated with impaired p53 function that comprises administering a therapeutically effective amount of a SLC7 modulating agent, whereby p53 function is restored.
 40. The method of claim 37 wherein the impaired p53 function is attributable to an overexpression of p53.
 41. The method of claim 37 wherein the impaired p53 function is attributable to an underexpression of p53.
 42. The method of claim 37 wherein the SLC7 modulating agent specifically modulates SLC7A5.
 43. The method of claim 37 wherein the SLC7 modulating agent specifically modulates SLC7A11.